(57) Abstract: The invention provides a nucleic acid molecule comprising: (a) a nucleotide sequence as shown in SEQ ID No. 35;
or (b) a nucleotide sequence which is the complement of SEQ ID No. 35; or (c) a nucleotide sequence which is degenerate with SEQ
ID No. 35; or (d) a nucleotide sequence hybridising under conditions of high stringency to SEQ ID No. 35, to the complement of
SEQ ID No. 35, or to a hybridisation probe derived from SEQ ID No. 35 or the complement thereof; or (e) a nucleotide sequence
having at least 80% sequence identity with SEQ ID No. 35; or (f) a nucleotide sequence having at least 65% sequence identity with
SEQ ID No. 35 wherein said sequence preferably encodes or is complementary to a sequence encoding a nystatin PKS enzyme or a
part thereof. Also provided are parts of such molecules and polypeptides (and parts thereof) encoded by such a nucleic acid molecule,
and the use of such molecules and polypeptides in facilitating nystatin biosynthesis and in the synthesis of nystatin derivatives and
novel polyketide or macrolide structures.

GENE CLUSTER ENCODING A NYSTATIN POLYKETIDE
SYNTHASE AND ITS MANIPULATION AND UTILITY

The present invention relates to the cloning and 
sequencing of the gene cluster encoding a modular 
polyketide synthase enzyme involved in the biosynthesis 
of the macrolide antibiotic nystatin. The invention 
thus relates to novel genes and nucleic acid molecules 
encoding proteins/polypeptides exhibiting functional
activities involved in nystatin biosynthesis, such
functional proteins/polypeptides themselves, and their
uses both in facilitating nystatin biosynthesis and in
the synthesis of nystatin derivatives and novel
polyketide or macrolide structures. 

Polyketides are natural products synthesized by
microorganisms, many of which have applied potential as
pharmaceuticals or as agricultural or veterinary
products. Examples of polyketides used in medical
treatments include the antibiotics erythromycin
(antibacterial), nystatin (antifungal), avermectin
(antiparasitic), rapamycin (immunosuppressant) and
daunorubicin (antitumor). The Gram-positive bacteria
Streptomyces are the main producers of polyketides, and
the genetics and biochemistry of polyketide biosynthesis
in these organisms are relatively well characterized
(Hopwood et al., Chem. Rev. v. 97: 2465-2497 (1997)).
Macrolide polyketide compounds are formed via repeated
condensations of simple carboxylic acids by modular
(type I) polyketide synthases (PKS) in a manner similar
to fatty acid biosynthesis. The modular hypothesis
proposed by Donadio et al., Science, v. 252: 675-679
(1991) suggested that type I PKSs are organized in
repeated units (modules), each of which is responsible
for one condensation cycle in the synthesis of a
polyketide chain. This was proven to be correct by
manipulations of type I PKS genes resulting in
predictable changes in the chemical structures of
macrolides. Besides condensation of the next carboxylic
acid onto the growing polyketide chain, ensured by the
catalytic activity of the β-ketoacyl synthase (KS)
domain, modules of type I PKSs may contain domains with
β-ketoreductase (KR), dehydratase (DH) and enoyl
reductase (ER) activities, which determine the reduction
state of incorporated extender units. The
acyltransferase (AT) and acyl carrier protein (ACP)
domains present in each module are responsible for the
choice of extender unit and retention of the growing
polyketide chain on the PKS, respectively. Upon
completion of synthesis, the polyketide chain is
released from the PKS via the action of a thioesterase
(TE), which is probably also involved in cyclization of
the final product. Thus, type I PKSs represent an
assembly line for polyketide biosynthesis that can be
manipulated by changing the number of modules, their
specificities towards carboxylic acids, and by
inactivating or inserting domains with reductive
activities (Katz, Chem. Rev., v. 97, 2557-2575, 1997).
After the polyketide moiety of a macrolide is
synthesized and cyclized to form a macrolactone ring, it
is usually modified via hydroxylation, glycosylation,
methylation and/or acylation. These modifications are
believed to be crucially important for the biological
activities of macrolides.
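
By way of illustration only, the relationship between the reductive domains of a module and the reduction state of the incorporated extender unit may be sketched as follows. The class name, module names and domain sets are hypothetical and do not describe any module of the nystatin PKS; the mapping is the commonly described one for type I PKSs (no KR: keto group; KR only: hydroxyl; KR and DH: double bond; KR, DH and ER: fully reduced methylene).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """A hypothetical type I PKS extension module (illustrative only)."""
    name: str
    domains: frozenset  # e.g. frozenset({"KS", "AT", "ACP", "KR"})


def beta_carbon_state(module: Module) -> str:
    """Return the expected state of the beta-carbon left by this module.

    The reductive domains act in the order KR -> DH -> ER, so a later
    domain only matters if the earlier ones are present.
    """
    d = module.domains
    if "KR" not in d:
        return "keto"
    if "DH" not in d:
        return "hydroxyl"
    if "ER" not in d:
        return "double bond"
    return "methylene"


# Hypothetical modules, not taken from the nystatin gene cluster.
examples = [
    Module("example-1", frozenset({"KS", "AT", "ACP"})),
    Module("example-2", frozenset({"KS", "AT", "ACP", "KR"})),
    Module("example-3", frozenset({"KS", "AT", "ACP", "KR", "DH"})),
    Module("example-4", frozenset({"KS", "AT", "ACP", "KR", "DH", "ER"})),
]

for m in examples:
    print(m.name, "->", beta_carbon_state(m))
```

Inactivating or inserting a reductive domain, as discussed above, corresponds in this sketch to removing or adding an entry in the domain set of one module.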

The genes for macrolide antibiotic biosynthesis in
Streptomyces are organized in clusters, and a number of
such clusters have already been identified.
Exploitation of recombinant DNA technology has made it
possible to isolate complete antibiotic biosynthetic
gene clusters by screening gene libraries with DNA
probes encoding PKSs (Schwecke et al., Proc. Natl. Acad.
Sci. USA, v. 92: 7839-7843, 1995). The molecular
cloning and complete DNA sequencing have been described
for several macrolide antibiotic gene clusters of
Streptomyces, including those for avermectin, pikromycin
and rapamycin (Ikeda et al., Proc. Natl. Acad. Sci. USA,
v. 96: 9509-9514, 1999; Xue et al., Proc. Natl. Acad.
Sci. USA, v. 95: 12111-12116, 1998; Schwecke et al.,
Proc. Natl. Acad. Sci. USA, v. 92: 7839-7843, 1995).
Partial cloning and DNA sequencing of the gene cluster
for the polyene macrolide antibiotic pimaricin has
recently been reported (Aparicio et al., J. Biol. Chem.,
v. 274: 10133-10139, 1999). However, a complete DNA
sequence of genes for the biosynthesis of a polyene
macrolide antibiotic with antifungal activity has not
yet been disclosed. There is a need and desire to
increase the repertoire of available antifungal
antibiotics, and/or to improve upon the properties (e.g.
efficacy, toxicity, etc.) of existing drugs. Hence the
provision of new antifungal treatments, particularly
those exhibiting new or improved properties, would
represent a considerable advance in the art.

The present invention is directed to this aim, and
is based on the cloning and DNA sequencing of the
nystatin biosynthesis gene cluster. This provides the
first example of the identification of such antifungal
antibiotic biosynthesis genes, as well as a tool for
genetic manipulation in order to modify the properties
of nystatin and/or the producing organism, or to obtain
novel potentially useful compounds.

The polyene antifungal antibiotic nystatin A1, the
complete stereostructure of which (see Fig. 1) has been
determined by Lancelin & Beau, Tetrahedron Lett., v. 30:
4521-4524 (1989), is produced by Streptomyces noursei
ATCC11455. From the structure of the nystatin molecule,
which belongs to the class of macrolide compounds, we
predicted that its polyketide backbone is synthesized by
a type I PKS enzyme. Based on this assumption, and as
described in the Examples below, a genomic library of
Streptomyces noursei ATCC11455 was screened using a
specially designed probe obtained by PCR using primers
based on conserved amino acid sequences within the
β-ketoacyl synthase (KS) and acyl carrier protein (ACP)
domains of known modular PKS enzymes. This led to the
identification of a number of clones or fragments which
we have sequenced and shown to contain parts or portions
of the nystatin PKS gene cluster. We have further shown
that alteration of the fragment sequences to inactivate
the encoded product leads to abrogation of nystatin
biosynthesis (see Example 1 below), thereby confirming
the requirement of the identified PKS for nystatin
biosynthesis. Subsequent work on the clones/fragments
has led to the sequencing of the PKS gene cluster, and
the identification of the different modules and
enzymatic domains, regulatory regions etc. within it.

Furthermore, as will be described in more detail
below, we have shown that manipulations of functional
DNA sequences within the novel nystatin PKS gene cluster
which we have identified have led to the synthesis of
novel molecular structures, e.g. nystatin derivatives
with improved function. This opens up the exciting
possibility of manipulating the nystatin A1 PKS gene
cluster to obtain not only beneficial new nystatin
derivatives, but also to improve and facilitate the
biosynthetic production process (for example to improve
yield, or production conditions, or to expand the range
of available host cells) or to provide novel compounds
with new activities and/or properties.

More particularly, two primary regions (or "parts"
or "portions") of the nystatin PKS gene cluster were
initially identified and sequenced, together
representing approximately 80% of the nystatin PKS gene
cluster, and the functional sequences within said
regions (e.g. PKS genes, regulatory regions etc., as well
as functional gene products, enzymatic domains etc.)
have been identified and characterised.

The first region ("Region 1"), which we have termed
"Nys 1" and the complete DNA sequence of which is shown
in SEQ ID No. 1, has been shown to contain a number of
PKS or associated genes or regulatory regions, and 13
separate "features" or open reading frames (ORFs) have
been identified (the amino acid sequences of the
translation products of which are shown in SEQ ID Nos. 3
to 15 respectively).

The second region ("Region 2"), which we have
termed "Nys 2", and the complete coding sequence of
which is shown in SEQ ID No. 2, also comprises a number
of "functional" regions, and 5 separate "features" or
ORFs have been identified, the amino acid sequences of
the translation products of which are shown in SEQ ID
Nos. 16 to 20 respectively.

Nys 1 and Nys 2 (i.e. SEQ ID Nos. 1 and 2 and the
sequences they encode) are the subject of British Patent
Application No. 0002840.7 filed on 8 February 2000.

Subsequent sequencing efforts have led to the
determination of the sequence of the DNA spanning the
gap between SEQ ID Nos. 1 and 2, and the identification
of novel genes in this region. In addition, the partial
gene sequences contained in SEQ ID Nos. 1 and 2 encoding
the gene products NysI (SEQ ID No. 20) and NysDII (SEQ
ID No. 3) (see further below) have been completed (see
new SEQ ID Nos. 36 and 37 respectively - see further
below). Thus, these sequencing efforts have led to the
identification and sequencing of the DNA region
encompassing the entire nystatin PKS gene cluster, and
the identification and characterisation of the
functional sequences within this region.

The complete coding sequence for (i.e. the complete
nucleotide sequence encoding) the nystatin biosynthetic
gene cluster is shown in SEQ ID No. 35. This has been
shown to contain a number of PKS or associated genes or
regulatory regions, and 23 separate "features" or ORFs
have been identified (the amino acid sequences of the
translation products of which are shown in SEQ ID Nos. 2
to 19, and 36 to 42 respectively).

The complete coding sequence for (i.e. the complete
nucleotide sequence encoding) the nystatin biosynthetic
gene cluster, as shown in SEQ ID No. 35, is the subject
of British Patent Application No. 0008786.6 filed on 10
April 2000 and British Patent Application No. 0009387.2
filed on 14 April 2000.

In one aspect, the present invention thus provides
a nucleic acid molecule comprising:

(a) a nucleotide sequence as shown in SEQ ID No. 35; or

(b) a nucleotide sequence which is the complement of
SEQ ID No. 35; or

(c) a nucleotide sequence which is degenerate with SEQ
ID No. 35; or

(d) a nucleotide sequence hybridising under conditions
of high stringency to SEQ ID No. 35, to the complement
of SEQ ID No. 35, or to a hybridisation probe derived
from SEQ ID No. 35 or the complement thereof; or

(e) a nucleotide sequence having at least 80% sequence
identity with SEQ ID No. 35; or

(f) a nucleotide sequence having at least 65% sequence
identity with SEQ ID No. 35 wherein said sequence
preferably encodes or is complementary to a sequence
encoding a nystatin PKS enzyme or a part thereof.

A "nystatin PKS enzyme" is defined further below,
but briefly in the context of section (f) above means an
enzyme or protein or polypeptide that is functional in
the synthesis, transport or transfer of a macrolide
antibiotic or polyketide moiety, preferably nystatin or
a nystatin derivative or nystatin-related molecule.

In a further aspect, the present invention also
provides a nucleic acid molecule comprising:

(a) a nucleotide sequence as shown in SEQ ID No. 1
and/or in SEQ ID No. 2; or

(b) a nucleotide sequence which is the complement of
SEQ ID No. 1 and/or SEQ ID No. 2; or

(c) a nucleotide sequence which is degenerate with SEQ
ID No. 1 and/or SEQ ID No. 2; or

(d) a nucleotide sequence hybridising under conditions
of high stringency to SEQ ID No. 1 and/or SEQ ID No. 2,
to the complement of SEQ ID No. 1 and/or SEQ ID No. 2,
or to a hybridisation probe derived from SEQ ID Nos. 1
and/or 2 or the complements thereof; or

(e) a nucleotide sequence having at least 65% sequence
identity with SEQ ID No. 1 and/or SEQ ID No. 2, wherein
said sequence preferably encodes or is complementary to
a sequence encoding a nystatin PKS enzyme or a part
thereof.

A nucleic acid molecule of the invention may be an
isolated nucleic acid molecule (in other words isolated
or separated from the components with which it is
normally found in nature) or it may be a recombinant or
a synthetic nucleic acid molecule.

The nucleic acid molecule of the invention encodes
(or comprises a nucleotide sequence encoding) the
nystatin A1 PKS enzyme, or a portion thereof, e.g. a
sequence encoding a single domain, or comprises a
nucleotide sequence in the nystatin A1 PKS gene cluster
which is a functional or non-functional genetic element.
More precisely, the nucleic acid molecule of the
invention encodes one or more polypeptides, or comprises
one or more genetic elements having functional activity
in the synthesis of a macrolide antibiotic or a
polyketide moiety, preferably nystatin or a nystatin
derivative or nystatin-related molecule. Such
functional activity may be enzymatic activity, e.g. an
activity involved in the synthesis or transport or
transfer of a polyketide moiety or a macrolide molecule
(this can be macrolide chain or ring synthesis or any
step contributory thereto, or macrolide ring or
polyketide chain modification etc.) and/or it may be a
regulatory activity, e.g. regulation of the expression
of the genes or proteins involved in the synthesis, or
regulation of the synthetic process, and/or it may be a
"transporter activity". Thus, included generally are
also transport proteins involved in the transfer or
transport of polyketide or macrolide moieties, e.g. in
the transport or efflux of the synthesised molecule
within or out of the cell. Also included in this
respect are glycosylation proteins, which include
molecules involved in the biosynthesis and/or attachment
of saccharides (e.g. mycosamine) to the macrolide or
polyketide.

Whilst nucleotide sequences encoding a desired
product are preferred according to the invention, also
encompassed are nucleotide sequences comprising
functional genetic elements such as promoters,
promoter-operator regions, enhancers, other regulatory
sequences etc. Thus, the nucleic acid molecule of the
invention need not comprise the entire PKS gene cluster
but may comprise a portion or part of it, e.g. a part
encoding a polypeptide having a particular function or a
regulatory sequence. This may comprise one or more genes,
and/or regulatory sequences, and/or one or more modules,
or enzymatic domains, or non-coding or coding functional
genetic elements (e.g. elements controlling gene
expression, transcription, translation etc.).

In one such aspect, the invention provides a
nucleic acid molecule as defined above, wherein said
nucleotide sequence of SEQ ID No. 35 (or variant thereof
as defined in (b) to (f) above) does not include the
portions of the molecule comprising ORF 1 (see Table 1
below). In other words, in this embodiment, the
nucleotide sequence of SEQ ID No. 35 does not comprise
nucleotides 124026 to 125222.

In another such aspect, the invention provides a
nucleic acid molecule as defined above, wherein said
nucleotide sequence of SEQ ID No. 35 (or variant thereof
as defined in (b) to (f) above) does not include the
portions of the molecule comprising ORF 2 (see Table 1
below). In other words, in this embodiment, the
nucleotide sequence of SEQ ID No. 35 does not comprise
nucleotides 122812 to 123876.

In a further such aspect, the invention provides a
nucleic acid molecule as defined above, wherein said
nucleotide sequence of SEQ ID No. 35 (or variant thereof
as defined in (b) to (f) above) does not include the
portions of the molecule comprising NysF (see Table 1
below). In other words, in this embodiment, the
nucleotide sequence of SEQ ID No. 35 does not comprise
nucleotides 454 to 1191.

Alternatively, the invention provides nucleic acid
molecules which contain a part of ORF 1, ORF 2 and/or
NysF, or a modified sequence of ORF 1, ORF 2 and/or
NysF, such that the expression or the function of the
ORF 1, ORF 2 and/or NysF gene product is ablated.
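
Purely as an illustration of the nucleotide ranges recited in the three preceding embodiments, the following sketch removes a range from a sequence held as a string. It assumes that the stated positions are 1-based and inclusive; the function names and the short test sequence are hypothetical and are not part of the disclosed sequences.

```python
# Ranges recited above for SEQ ID No. 35 (assumed 1-based, inclusive).
EXCLUDED_REGIONS = {
    "ORF1": (124026, 125222),
    "ORF2": (122812, 123876),
    "nysF": (454, 1191),
}


def region_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based inclusive range."""
    return end - start + 1


def excise(sequence: str, start: int, end: int) -> str:
    """Return `sequence` without nucleotides start..end (1-based, inclusive)."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("range must lie within the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return sequence[:start - 1] + sequence[end:]


# Each recited range is a whole number of codons under this assumption.
for name, (start, end) in EXCLUDED_REGIONS.items():
    print(name, region_length(start, end), region_length(start, end) % 3 == 0)

# Hypothetical 8-nucleotide sequence, used only to show the indexing.
print(excise("ACGTACGT", 3, 5))
```

When more than one range is to be removed from the same sequence, applying the excisions from the highest coordinates downwards keeps the remaining coordinates valid.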

Included within the scope of the invention are
nucleotide sequences which hybridise to SEQ ID Nos. 1 or
2 or 35 or their complements, or to parts thereof (i.e.
to hybridisation probes derived from SEQ ID Nos. 1 or 2
or 35 which are discussed in more detail below), under
high stringency conditions and which preferably encode
or are complementary to a sequence which encodes a
nystatin PKS enzyme or part thereof. Conditions of high
stringency may readily be determined according to
techniques well known in the art, as described for
example in Sambrook et al., 1989, Molecular Cloning, A
Laboratory Manual, 2nd Edition. Hybridising sequences
included within the scope of the invention are those
binding under non-stringent conditions (6 x SSC/50%
formamide at room temperature) and washed under
conditions of high stringency (e.g. 0.1 x SSC, 68°C),
where SSC = 0.15 M NaCl, 0.015 M sodium citrate, pH 7.2.
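
The salt concentrations implied by the binding and washing conditions above follow by simple scaling of the 1 x SSC composition given in the text. The following is a minimal sketch of that arithmetic; the function name is hypothetical.

```python
# Composition of 1 x SSC as stated in the text (mol/L).
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc(strength: float) -> dict:
    """Molar concentrations of each salt in `strength` x SSC."""
    # Rounding only hides floating-point noise; it is not a lab tolerance.
    return {salt: round(molar * strength, 6) for salt, molar in SSC_1X.items()}


print("6 x SSC (binding):", ssc(6))
print("0.1 x SSC (high-stringency wash):", ssc(0.1))
```

The wash buffer is thus sixty-fold more dilute in salt than the binding buffer, which, together with the elevated temperature, is what makes the wash the stringent step.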

A hybridisation probe may be a part of the SEQ ID
No. 1 or SEQ ID No. 2 or SEQ ID No. 35 sequence (or
complementary sequence), which is of sufficient base
length and composition to function to hybridise to
sample or test nucleic acid sequences to determine
whether or not hybridisation under high stringency
conditions occurs. The probe may thus be at least 15
bases in length, preferably at least 30, 40, 50, 75, 100
or 200 bases in length. Representative probe lengths
thus include 30-500 bases e.g. 30-300, 50-200, 50-150,
75-100.

The hybridisation probe may be derived from a
coding or non-coding, functional or non-functional part
of the sequence (i.e. SEQ ID Nos. 1 or 2 or 35 or their
complements), and may for example correspond to a gene
or module or to an enzymatic domain, or a part thereof
(e.g. the part encoding the active site) or to a
sequence which links enzymatic domains or modules.
Thus, the hybridisation probe may have functional
activity in polyketide/macrolide synthesis as defined
above.

Nucleotide sequence identity may be determined
using the BestFit program of the Genetics Computer Group
(GCG) Version 10 Software package from the University of
Wisconsin. The program uses the local homology
algorithm of Smith and Waterman with the default values:
Gap creation penalty = 50, Gap extension penalty = 3,
Average match = 10.000, Average mismatch = -9.000.
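
For illustration, once two sequences have been locally aligned, a percentage identity can be computed from the aligned strings as sketched below. This is not the BestFit implementation: the function name, the gap character and, in particular, the choice of denominator (all alignment columns, including gapped ones) are assumptions made for the example, and the short sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both residues are identical.

    Both inputs must already be aligned, i.e. of equal length with `gap`
    marking insertions or deletions. Columns that are gaps in both
    sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments: 7 identical columns out of 9.
identity = percent_identity("ACGT-ACGT", "ACGTTACGA")
print(f"{identity:.1f}% identity")
print("meets 65% criterion:", identity >= 65)
print("meets 80% criterion:", identity >= 80)
```

Other conventions (for example dividing by the length of the shorter sequence) give different percentages for the same alignment, so the convention should be stated whenever a threshold such as 65% or 80% is applied.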

Nucleotide sequences according to the invention may
exhibit at least 65%, 70%, 75%, 80%, 85%, 90%, 95% or
98% sequence identity with SEQ ID Nos. 1 or 2 or 35 and
preferably encode or are complementary to a sequence
which encodes a nystatin PKS enzyme or part thereof.
Nucleotide sequences meeting the % sequence identity
criteria defined herein may be regarded as
"substantially identical" sequences.

Where the nucleic acid molecules of the invention
are defined by reference to SEQ ID Nos. 1 and/or 2, the
nucleic acid molecule of the invention may thus comprise
a SEQ ID No. 1 or "SEQ ID No. 1-variant" sequence (i.e.
a sequence complementary, or degenerate to SEQ ID No. 1
or a functionally equivalent variant such as a
hybridising or substantially identical sequence as
defined above) or a SEQ ID No. 2 or "SEQ ID No. 2-variant"
sequence or both. Nucleic acid molecules
comprising both a SEQ ID No. 1/SEQ ID No. 1-variant
sequence and a SEQ ID No. 2/SEQ ID No. 2-variant
sequence are preferred.

As referred to herein "functionally equivalent
variants" or "functional equivalents" retain at least
one function of the entity to which they are related (or
from which they are derived), e.g. encode a protein with
substantially the same properties, or exhibit
substantially the same regulatory or other functional
properties or activities.

As mentioned above, nucleic acid molecules
comprising parts or portions (e.g. fragments) of the
nucleotide sequences of SEQ ID No. 35 or of SEQ ID No. 1
and/or 2 (or their complementary, degenerate or
functionally equivalent variants) are also included
within the scope of the invention.

Such parts or portions of the PKS gene cluster
advantageously may also be regarded as functional
equivalents of the complete sequence. For example, the
sequence portion or fragment may retain a functional
activity as defined above, e.g. an enzymatic,
regulatory, or transporter activity in polyketide or
macrolide biosynthesis.

Conveniently, the part or portion of the PKS gene
cluster (e.g. of SEQ ID No. 35, or 1 and/or 2) is at
least 15 bases in length, more preferably at least 20,
25, 30, 35, 40, 50, 70, 100, 200, 300, 400, 500, 1000,
2000, 5000, 10,000, 15,000, 20,000, 30,000, or 50,000
bases. Representative fragment lengths thus include
15-50,000 bases e.g. 50-30,000 bases, or 100-20,000,
100-10,000 or 200-5,000, or 200-2,000. The part or portion
may comprise or encode contiguous or non-contiguous
nucleotides or amino acids.

Parts or portions of functional parts of the PKS
gene sequences are discussed in more detail below.

Parts or portions of the PKS gene cluster may also
comprise the non-coding or non-functional part of the
DNA molecule (or the nucleotide sequences), for example
promoter or operator sequences, or linker sequences
joining individual genes, modules or enzymatic domains.
These may be contiguous or non-contiguous and will be
discussed further below.

As mentioned above, a number of genes and ORFs
within SEQ ID Nos. 35, 1 and 2 have been identified and
such genes (or their complementary, degenerate or
functionally equivalent variants as defined above)
represent preferred "parts" or fragments of SEQ ID Nos.
35, 1 and 2. These are tabulated in Table 1 below:

Table 1

Molecule features of SEQ ID Nos. 35, 1 and 2 (the whole
gene cluster sequence (125401 bp), Nys 1 (65140 bp) and
Nys 2 (27541 bp) respectively)

SEQ ID No. 35

Start     End        Gene             Description

1191      454 C      nysF             putative 4'-phosphopantetheine transferase
3092      1275 C     nysG             ABC transporter
4824      3070 C     nysH             ABC transporter
5122      6156       nysDIII          GDP-mannose-4,6-dehydratase homolog
6338      34771      nysI             NysI PKS, modules 9-14
34792     51097      nysJ             NysJ PKS, modules 15-17
51155     57355      nysK             NysK PKS (module 18 + TE)
57503     58685      nysL             P450 monooxygenase NysL
58980     58788 C    nysM             ferredoxin NysM
60241     59047 C    nysN             P450 monooxygenase NysN
61296     60240 C    nysDII           putative aminotransferase
62837     61317 C    nysDI            putative UDP-glucuronosyl transferase
63067     67167      nysA             NysA PKS, loading module
67213     76791      nysB             NysB PKS, modules 1 and 2
76811     110101     nysC             NysC PKS, modules 3-8
110521    111276     nysE             putative thioesterase
111666    114566     nysRI            transcriptional activator
114590    117451     nysRII           putative transcriptional activator
117441    120224     nysRIII          putative transcriptional activator
120676    121308     nysRIV (short)   putative response regulator
120628    121308     nysRIV (long)    putative response regulator
121997    122758     nysRV            putative repressor
123876    122812 C   ORF2             putative transcriptional regulator
124026    125222     ORF1             putative peptidase (aminohydrolase)

Note: "C" indicates that the gene is encoded by the
complement DNA strand.





Nys 1 (SEQ ID No. 1)

Start     End        Gene Name   Description

1035      3 C        nysD2       putative aminotransferase
2447      1058 C     nysD1       putative UDP-glucuronosyl-transferase
2806      6904       nysA        nystatin PKS, loading module
6952      16528      nysB        nystatin PKS, modules 1 and 2
16550     49838      nysC        nystatin PKS, modules 3-8
50227     51013      nysE        putative thioesterase
51405     54303      nysR1       putative transcriptional activator 1
54329     57188      nysR2       putative transcriptional activator 2
57180     59961      nysR3       putative transcriptional regulator 3
60367     61045      nysR4       putative response regulator
61736     62495      nysR5       putative repressor
63615     62553 C    ORF2        putative transcriptional regulator
63765     64959      ORF1        putative peptidase (aminohydrolase)




Nys 2 (SEQ ID No. 2)

Start     End        Gene Name   Description

1191      456 C      nysF        putative 4'-phosphopantetheine transferase
3092      1277 C     nysG        putative ABC transporter
4824      3072 C     nysH        putative ABC transporter
5122      6154       nysD3       putative GDP-mannose-4,6-dehydratase
6338      27541      nysI        nystatin PKS, modules 9-13 (incomplete)

"C" in the tables above refers to complementary strands.

It will be appreciated that nysD2, D1, A, B, C, E,
R1 to R5, ORF1 and ORF2 of SEQ ID No. 1, and nysF, G, H
and D3 and the partial nysI sequence of SEQ ID No. 2,
correspond to their named counterparts in SEQ ID No. 35,
which represents the complete coding sequence for
the gene cluster, and comprises the nucleotide sequences
of SEQ ID Nos. 1 and 2. There is however a difference
in the nucleotide numbering as between the corresponding
features in SEQ ID Nos. 1, 2 and 35; for SEQ ID Nos. 1 and
2, the first nucleotide in the stop codon is recognised
as the end of the gene, whereas for SEQ ID No. 35, the
third nucleotide of the stop codon is recognised as the
end of the gene. nysD1, D2 and D3 are also known as
nysDI, DII and DIII, and nysR1 to R5 are also known as
nysRI to RV.
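
The numbering difference just described may be illustrated by the following sketch, which converts Nys 1 (SEQ ID No. 1) coordinates from Table 1 into SEQ ID No. 35 coordinates. It rests on an assumption that is not stated in the text, namely that Nys 1 occupies the final 65140 nucleotides of the 125401 bp cluster sequence, giving an offset of 60261; the function and constant names are hypothetical. The stop codon adjustment of two nucleotides is added for forward-strand genes and subtracted for genes on the complement strand.

```python
# Lengths from the Table 1 caption.
SEQ35_LENGTH = 125401
NYS1_LENGTH = 65140

# ASSUMPTION: Nys 1 lies at the very end of SEQ ID No. 35.
NYS1_OFFSET = SEQ35_LENGTH - NYS1_LENGTH  # 60261


def nys1_to_seq35(start: int, end: int, complement: bool = False):
    """Map (start, end) of a gene in SEQ ID No. 1 to SEQ ID No. 35.

    In SEQ ID No. 1 the end is the first nucleotide of the stop codon;
    in SEQ ID No. 35 it is the third. On the complement strand the
    gene runs towards lower coordinates, so the shift is negative.
    """
    stop_shift = -2 if complement else 2
    return start + NYS1_OFFSET, end + NYS1_OFFSET + stop_shift


# Rows taken from Table 1.
print("nysC:", nys1_to_seq35(16550, 49838))
print("ORF2:", nys1_to_seq35(63615, 62553, complement=True))
print("ORF1:", nys1_to_seq35(63765, 64959))
```

Under this assumption the nysC, ORF2 and ORF1 rows of the two tables agree. Some rows do not follow the simple mapping (for example, the tabulated start of nysE in SEQ ID No. 35 is not the Nys 1 start plus the offset), so the sketch should be read as an aid to comparing the tables rather than as a definitive conversion.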

As regards SEQ ID No. 1 (Nys 1) and the gene
sequence nysRI, Table 1 shows this to comprise
nucleotides 51405 to 54303. Further sequence analysis
has revealed however that there are in fact two start
codons, nucleotides 51405-51407 encoding GTG and
nucleotides 51408-51410 encoding ATG. Accordingly,
nucleotide 51408 of SEQ ID No. 1 represents an
alternative start nucleotide for nysRI. ATG is
preferred as a start codon to GTG, and consequently
51408 is regarded as the start nucleotide in future
references to nysRI. The start of nysRI in SEQ ID No.
35 is indicated in Table 1 above as the ATG codon, and
the translation product of nysRI shown in SEQ ID No. 9
below is deduced from nucleotides 51408-54303 of SEQ ID
No. 1. [In an alternative presentation of the NysRI
translation product, wherein nucleotide 51405 of SEQ ID
No. 1 is the start nucleotide, SEQ ID No. 9 is modified
by the inclusion of an additional "first" amino acid,
V].

As regards SEQ ID Nos. 35 and 1, and the gene
sequence nysRIV, Table 1 shows this to comprise
nucleotides 120676 to 121308 in SEQ ID No. 35 (nysRIV
short). Further sequence analysis has revealed however
that there are in fact two start codons, nucleotides
120676-120678 encoding GTG and nucleotides 120628-120630
encoding GTG. These start codons correspond to start
codons in SEQ ID No. 1 for nysR4 at nucleotides
60367-60369 (as stated in Table 1) and 60415-60417. The
upstream GTG (120628-120630 in SEQ ID No. 35, which
corresponds to 60367-60369 in SEQ ID No. 1) is preferred
as a start codon (see Example 6), and consequently
120628 is regarded as the start nucleotide in future
references to nysRIV. The start of nysRIV in SEQ ID No.
35 is indicated in Table 1 above (nysRIV short) as the
downstream start codon 120676-120678, and the deduced
translation product of nysRIV named herein as "NysRIV
short" is shown in SEQ ID No. 12. Table 1 also shows
the alternative start codon of nysRIV (nysRIV long) in
SEQ ID No. 35 as the upstream start codon 120628-120630.
Thus a preferred alternative presentation of the NysRIV
translation product, named herein "NysRIV (long)", is
shown in SEQ ID No. 43.
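
The consequence of choosing one or other start codon for nysRIV can be worked through from the coordinates alone, as in the following sketch. It assumes 1-based inclusive coordinates and that, as stated above for SEQ ID No. 35, the end coordinate is the third nucleotide of the stop codon; the function name is hypothetical and the residue counts are arithmetic results under those assumptions rather than values read from the sequence listing.

```python
def orf_size(start: int, end: int):
    """Return (nucleotides, encoded residues) for a forward-strand ORF.

    `end` is taken to be the last nucleotide of the stop codon, so the
    stop codon is subtracted from the codon count.
    """
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("coordinates do not span a whole number of codons")
    return nucleotides, nucleotides // 3 - 1


short = orf_size(120676, 121308)  # downstream GTG, "nysRIV short"
long_ = orf_size(120628, 121308)  # upstream GTG, "nysRIV long"

print("short:", short)
print("long: ", long_)
print("additional N-terminal residues in long form:", long_[1] - short[1])
```

On these assumptions the two start codons are 48 nucleotides apart and in the same reading frame, so the long form differs from the short form only by an N-terminal extension.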

Alternative representative parts of the SEQ ID No.
35, or 1 and/or 2 sequences include the nucleotide
sequences between the respective "start" and "end"
nucleotide positions, either individually or
collectively.

The translation products of the respective "genes"
have been deduced and the amino acid sequences are set
out in the following SEQ ID Nos. shown below:



Gene Product                  SEQ ID No.

NysD2 (NysDII) (partial)      3
NysD1 (NysDI)                 4
NysA                          5
NysB                          6
NysC                          7
NysE                          8
NysR1 (NysRI)                 9
NysR2 (NysRII)                10
NysR3 (NysRIII)               11
NysR4 (NysRIV) (short)        12
NysR4 (NysRIV) (long)         43
NysR5 (NysRV)                 13
ORF2                          14
ORF1                          15
NysF                          16
NysG                          17
NysH                          18
NysD3 (NysDIII)               19
NysI                          20
NysDII (complete)             36
NysI (complete)               37
NysJ                          38
NysK                          39
NysL                          40
NysM                          41
NysN                          42



SEQ ID Nos. 4, 10, 12, 13, 14, 16, 18, 41 and 43
show valine (V) as the first amino acid. According to
practice in this field and conceptual translation of
bacterial DNA, the first amino acid of a protein
(translation product) is always methionine (M),
regardless of the start codon. Accordingly, translation
products and amino acid sequences of the present
invention include not only SEQ ID Nos. 4, 10, 12, 13,
14, 16, 18, 41 and 43 as presented but also
modifications of the aforesaid sequences in which the
first V is replaced with M. References to SEQ ID Nos.
4, 10, 12, 13, 14, 16, 18, 41 and 43 below will be
understood to include not only the sequences as
presented, but also the said sequences wherein the first
V is replaced with M.
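
The replacement of a leading valine by methionine described above is a simple string operation on a one-letter amino acid sequence, sketched below. The function name and the short peptide strings are hypothetical and are not taken from the sequence listing.

```python
def with_initiator_methionine(protein: str) -> str:
    """Return the sequence with a leading V replaced by M.

    Sequences that do not begin with V are returned unchanged, and only
    the first residue is ever altered.
    """
    if protein.startswith("V"):
        return "M" + protein[1:]
    return protein


# Hypothetical one-letter sequences used only to show the behaviour.
print(with_initiator_methionine("VTRLV"))  # leading V becomes M
print(with_initiator_methionine("MSKLV"))  # already starts with M
```

Internal valine residues are untouched, which reflects the fact that a GTG codon is read as methionine only when it serves as the start codon.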

Viewed from an alternative aspect, the present
invention also provides a nucleic acid molecule
comprising a nucleotide sequence encoding one or more
amino acid sequences selected from SEQ ID Nos. 3 to 20 or
36 to 43, or a nucleotide sequence which is
complementary thereto or degenerate therewith.

Also provided are nucleic acid molecules comprising
nucleotide sequences encoding one or more amino acid
sequences (i.e. polypeptides) which exhibit at least 60%
sequence identity with any one of SEQ ID Nos. 3 to 20 or
36 to 43.

A further aspect of the invention provides a 
polypeptide encoded by a nucleic acid molecule of the 
invention as defined herein. 

More particularly, this aspect of the invention 
provides a polypeptide comprising: 

(a) all or part of an amino acid sequence as shown in
any one or more of SEQ ID Nos. 3 to 20 or 36 to 43; or

(b) all or part of an amino acid sequence which has at 
least 60% sequence identity with any one or more of SEQ 
ID Nos. 3 to 20 or 36 to 43. 

In particular the amino acid sequence may exhibit
at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or 98%
identity with the polypeptide of any one of SEQ ID Nos.
3 to 20 or 36 to 43. Alternatively, the amino acid
sequence may exhibit at least 70%, 75%, 80%, 85%, 90%,
95% or 98% similarity with the polypeptide of any one of
SEQ ID Nos. 3 to 20 or 36 to 43. Amino acid
(polypeptide) sequences meeting the % sequence identity
or similarity criteria herein are regarded as
"substantially identical". The polypeptide of the
invention may be an isolated, purified or synthesized
polypeptide. The term "polypeptide" is used herein to
include any amino acid sequence of two or more amino
acids i.e. both short peptides and longer lengths (i.e.
polypeptides) are included.
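
As a rough illustration of the identity criterion, the sketch
below computes percent identity for two sequences that have
already been aligned (gaps written as "-") and compares it with
the 60% threshold. The choice of denominator (aligned columns
without gaps) is an assumption made for this sketch; alignment
programs differ on this point, and the sequences shown are
hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 60.0) -> bool:
    """True if the alignment meets the stated identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments: 4 identical residues in 5 gap-free columns.
print(percent_identity("MKT-LV", "MKSALV"))
print(is_substantially_identical("MKT-LV", "MKSALV"))
```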

Amino acid sequence identity or similarity may be
determined using the BestFit program of the Genetics
Computer Group (GCG) Version 10 Software package from
the University of Wisconsin. The program uses the local
homology algorithm of Smith and Waterman with the
default values: Gap creation penalty = 8, Gap extension
penalty = 2, Average match = 2.912, Average mismatch =
-2.003.
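
For orientation, a minimal Smith-Waterman local alignment score
with separate gap creation and gap extension penalties might
look as follows. This is a sketch and not the BestFit
implementation: it assumes a flat match/mismatch score (the
average values quoted above stand in for a full substitution
matrix), and it assumes that the first gapped residue costs the
creation penalty and each further gapped residue costs the
extension penalty, which may differ from the GCG convention.

```python
def smith_waterman_score(a: str, b: str,
                         match: float = 2.912, mismatch: float = -2.003,
                         gap_creation: float = 8.0,
                         gap_extension: float = 2.0) -> float:
    """Best local alignment score of a and b with affine gap penalties."""
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # best[i][j]: best score of a local alignment ending at a[i-1], b[j-1].
    best = [[0.0] * cols for _ in range(rows)]
    # gap_a / gap_b: best score ending with a gap in a / in b respectively.
    gap_a = [[neg_inf] * cols for _ in range(rows)]
    gap_b = [[neg_inf] * cols for _ in range(rows)]
    top = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            gap_a[i][j] = max(gap_a[i][j - 1] - gap_extension,
                              best[i][j - 1] - gap_creation)
            gap_b[i][j] = max(gap_b[i - 1][j] - gap_extension,
                              best[i - 1][j] - gap_creation)
            pair = match if a[i - 1] == b[j - 1] else mismatch
            best[i][j] = max(0.0, best[i - 1][j - 1] + pair,
                             gap_a[i][j], gap_b[i][j])
            top = max(top, best[i][j])
    return top


# Hypothetical peptides: the second lacks one residue of the first.
print(smith_waterman_score("ACDEFGH", "ACDFGH"))
```

The function returns only the score; recovering the alignment
itself would require a traceback, which is omitted here.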

A "part" of the amino acid sequence of any one of
SEQ ID Nos. 3 to 20 or 36 to 43 (or of a "substantially
identical" sequence as defined above) may comprise at
least 20 contiguous amino acids, preferably at least 30,
40, 50, 70, 100, 150, 200, 300, 400, 500, 1,000, 2,000,
5,000 or 10,000 contiguous amino acids.
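
The length criterion for a "part" can be illustrated with a
short check that a fragment is a contiguous stretch of a parent
sequence and is at least a given number of residues long. The
function name and test sequences are hypothetical; the default
of 20 is the minimum length stated above.

```python
def is_part(fragment: str, parent: str, min_length: int = 20) -> bool:
    """True if fragment is a contiguous stretch of parent of sufficient length."""
    return len(fragment) >= min_length and fragment in parent


# Hypothetical 30-residue parent and two candidate fragments.
parent = "MKTAYIAKQRQISFVKSHFSRQLEERLGLI"
print(is_part(parent[5:27], parent))   # 22 contiguous residues
print(is_part(parent[5:15], parent))   # only 10 residues, too short
```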

The polypeptide, and preferably also the part
thereof, is functionally active according to the
definitions given above, e.g. is enzymatically active or
has a regulatory or transport functional activity. The
part may not itself be functionally active but may in
some instances provide regions with functional
properties of the whole, e.g. represent the active site
or co-factor binding site required for enzymatic
activity.

The studies described in the Examples below have
characterised the nucleotide and polypeptide sequences
of the invention and various functional regions within
them have been identified. Such functional regions form
separate aspects of the invention. For the various
translation products (i.e. gene products of SEQ ID Nos.
3 to 20 or 36 to 43), these functional regions are
summarised in Table 2 below:

Table 2: Molecule Features of Translation Products of
SEQ ID Nos. 3 to 20 and 36 to 43



(i) SEQ ID No.: 5

Translation Product Name: NysA

Start   End     Name    Description
(AA)    (AA)

8       430     KS^S    KS domain, loading module
528     841     AT      AT domain, loading module
855     1055    DH      DH domain, loading module
1285    1359    ACP     ACP domain, loading module



WO 01/59126 



PCT/GB01/00509 
(ii) SEQ ID No.: 6

Translation Product Name: NysB

Start   End     Name    Description

42      462     KS1     KS domain, module 1
578     897     AT1     AT domain, module 1
                        (mMCoA-specific)
911     1110    DH1     DH domain, module 1
                        (inactive)
1201    1447    KR1     KR domain, module 1
1484    1559    ACP1    ACP domain, module 1
1579    2004    KS2     KS domain, module 2
2117    2439    AT2     AT domain, module 2
                        (mMCoA-specific)
2453    2659    DH2     DH domain, module 2 (inactive)
2749    2996    KR2     KR domain, module 2
3025    3102    ACP2    ACP domain, module 2


(iii) SEQ ID No.: 7

Translation Product Name: NysC

Start   End     Name    Description

35      455     KS3     KS domain, module 3
546     858     AT3     AT domain, module 3
872     1073    DH3     DH domain, module 3
1381    1628    KR3     KR domain, module 3
1662    1735    ACP3    ACP domain, module 3
1757    2180    KS4     KS domain, module 4
2291    2603    AT4     AT domain, module 4
2617    2818    DH4     DH domain, module 4
3124    3371    KR4     KR domain, module 4
3407    3480    ACP4    ACP domain, module 4
3501    3924    KS5     KS domain, module 5
4032    4346    AT5     AT domain, module 5
4360    4561    DH5     DH domain, module 5
4953    5239    ER5     ER domain, module 5
5248    5495    KR5     KR domain, module 5
5528    5601    ACP5    ACP domain, module 5
5623    6046    KS6     KS domain, module 6
6166    6478    AT6     AT domain, module 6
6492    6704    DH6     DH domain, module 6
7038    7281    KR6     KR domain, module 6
7315    7388    ACP6    ACP domain, module 6
7408    7831    KS7     KS domain, module 7
7939    8253    AT7     AT domain, module 7
8267    8470    DH7     DH domain, module 7
8812    9086    KR7     KR domain, module 7
9120    9193    ACP7    ACP domain, module 7
9214    9637    KS8     KS domain, module 8
9758    10072   AT8     AT domain, module 8
10086   10289   DH8     DH domain, module 8
10657   10904   KR8     KR domain, module 8
10939   11012   ACP8    ACP domain, module 8



(iv) SEQ ID No.: 9

Translation Product Name: NysR1

Start   End     Name    Description

42      49      P-loop  ATP/GTP binding site motif A
904     932     HTH     LuxR-type helix-turn-helix
                        motif (DNA binding)

(N.B. In the alternative representation of NysR1 above,
where nt 51405 of SEQ ID No. 1 is regarded as the start
codon, these start and end amino acid numbers would each
increase by 1.)
(v) SEQ ID No.: 10

Translation Product Name: NysR2

Start   End     Name    Description

902     930     HTH     LuxR-type helix-turn-helix
                        motif (DNA binding)



(vi) SEQ ID No.: 11

Translation Product Name: NysR3

Start   End     Name    Description

26      47      LZ      Leucine zipper motif (DNA binding)
548     568     TM1     Transmembrane domain (putative)
583     610     TM2     Transmembrane domain (putative)
884     912     HTH     LuxR helix-turn-helix motif

(vii) SEQ ID No.: 12

Translation Product Name: NysR4 (short)

Start   End     Name    Description

97      104     P-loop  ATP/GTP binding site motif A
149     177     HTH     LuxR helix-turn-helix motif (DNA
                        binding)
(viii) SEQ ID No.: 13

Translation Product Name: NysR5

Start   End     Name    Description

6       40      HTH     DeoR helix-turn-helix motif (DNA
                        binding)

(ix) SEQ ID No.: 14

Translation Product Name: ORF2

Start   End     Name    Description

186     202     HTH     AsnC HTH motif signature

(x) SEQ ID No.: 17

Translation Product Name: NysG

Start   End     Name    Description

31      313     TM      Transmembrane regions
392     399     P-loop  ATP/GTP binding site
496     510     ABC     ABC transporters signature

(xi) SEQ ID No.: 18

Translation Product Name: NysH

Start   End     Name    Description

17      288     TM      Transmembrane regions
368     375     P-loop  ATP/GTP binding motif A
472     486     ABC     ABC transporters signature
(xii) SEQ ID No.: 20

Translation Product Name: NysI (partial)

Start   End          Name    Description

34      448          KS9     KS domain, module 9
572     890          AT9     AT domain, module 9
904     [illegible]  DH9     DH domain, module 9
1443    1686         KR9     KR domain, module 9
1720    1793         ACP9    ACP domain, module 9
1813    2236         KS10    KS domain, module 10
2346    2664         AT10    AT domain, module 10
2678    2890         DH10    DH domain (inactive), module 10
2983    3229         KR10    KR domain, module 10
3266    3339         ACP10   ACP domain, module 10
3358    3780         KS11    KS domain, module 11
3898    4217         AT11    AT domain, module 11
                             (mMCoA-specific)
4231    4432         DH11    DH domain (inactive), module 11
4523    4770         KR11    KR domain, module 11
4806    4879         ACP11   ACP domain, module 11
4801    5325         KS12    KS domain, module 12
5432    5754         AT12    AT domain, module 12
5768    5977         DH12    DH domain (inactive), module 12
6068    6315         KR12    KR domain, module 12
6348    6421         ACP12   ACP domain, module 12
6454    6873         KS13    KS domain, module 13






(xiii) SEQ ID No.: 37

Translation Product Name: NysI

Start   End     Name    Description

34      448     KS9     KS domain, module 9
572     890     AT9     AT domain, module 9
904     1123    DH9     DH domain, module 9
1443    1686    KR9     KR domain, module 9
1720    1793    ACP9    ACP domain, module 9
1813    2236    KS10    KS domain, module 10
2346    2664    AT10    AT domain, module 10
2678    2890    DH10    inactive DH domain, module 10
2983    3229    KR10    KR domain, module 10
3266    3336    ACP10   ACP domain, module 10
3355    3777    KS11    KS domain, module 11
3898    4217    AT11    AT domain (methylmalonyl-CoA-
                        specific), module 11
4231    4432    DH11    inactive DH domain, module 11
4523    4769    KR11    KR domain, module 11
4806    4879    ACP11   ACP domain, module 11
4901    5325    KS12    KS domain, module 12
5432    5754    AT12    AT domain, module 12
5768    5977    DH12    inactive DH domain, module 12
6068    6315    KR12    KR domain, module 12
6348    6421    ACP12   ACP domain, module 12
6454    6873    KS13    KS domain, module 13
6973    7293    AT13    AT domain, module 13
7307    7448    DH13    inactive DH domain, module 13
7535    7774    KR13    inactive KR domain, module 13
7813    7886    ACP13   ACP domain, module 13
7908    8323    KS14    KS domain, module 14
8430    8741    AT14    AT domain, module 14
8755    8962    DH14    inactive DH domain, module 14
9050    9296    KR14    KR domain, module 14
9319    9394    ACP14   ACP domain, module 14



(xiii) SEQ ID No.: 38

Translation Product Name: NysJ

Start   End     Name    Description

41      464     KS15    KS domain, module 15
578     889     AT15    AT domain, module 15
903     1102    DH15    DH domain, module 15
1446    1731    ER15    ER domain, module 15
1740    1988    KR15    KR domain, module 15
2023    2096    ACP15   ACP domain, module 15
2117    2538    KS16    KS domain, module 16
2635    2953    AT16    AT domain, module 16
2967    3167    DH16    inactive DH domain, module 16
3257    3500    KR16    KR domain, module 16
3539    3612    ACP16   ACP domain, module 16
3634    4057    KS17    KS domain, module 17
4153    4472    AT17    AT domain, module 17
4486    4725    DH17    inactive DH domain, module 17
4997    5245    KR17    KR domain, module 17
5277    5350    ACP17   ACP domain, module 17


(xiii) SEQ ID No.: 39

Translation Product Name: NysK

Start   End     Name    Description

34      457     KS18    KS domain, module 18
568     881     AT18    AT domain, module 18
898     1102    DH18    inactive DH domain, module 18
1416    1663    KR18    KR domain, module 18
1695    1769    ACP18   ACP domain, module 18
1849    2066    TE      thioesterase domain




(xiv) SEQ ID No.: 43

Translation Product Name: NysRIV (long)

Start   End     Name    Description

31      85      PAS     PAS-like domain
113     120     P-loop  ATP/GTP binding site motif A
165     193     HTH     LuxR helix-turn-helix motif (DNA
                        binding)
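
To show how the coordinates in Table 2 might be used, the
following sketch stores the NysA rows as records and derives
the length of each domain and of the stretches lying between
consecutive domains. It assumes the Start and End values are
1-based, inclusive amino acid positions; the type and function
names are illustrative only.

```python
from typing import List, NamedTuple, Tuple


class Domain(NamedTuple):
    start: int  # first residue of the domain (assumed 1-based, inclusive)
    end: int    # last residue of the domain (assumed inclusive)
    name: str


# NysA rows from Table 2 (SEQ ID No. 5).
NYSA = [
    Domain(8, 430, "KS"),
    Domain(528, 841, "AT"),
    Domain(855, 1055, "DH"),
    Domain(1285, 1359, "ACP"),
]


def domain_length(domain: Domain) -> int:
    """Number of residues covered by the domain."""
    return domain.end - domain.start + 1


def inter_domain_lengths(domains: List[Domain]) -> List[Tuple[str, str, int]]:
    """Residues lying between each pair of consecutive domains."""
    ordered = sorted(domains)
    return [(left.name, right.name, right.start - left.end - 1)
            for left, right in zip(ordered, ordered[1:])]


def extract(protein: str, domain: Domain) -> str:
    """Slice the domain out of a protein sequence string."""
    return protein[domain.start - 1:domain.end]


for d in NYSA:
    print(d.name, domain_length(d))
print(inter_domain_lengths(NYSA))
```

The same records could be built for any of the other
translation products listed in the table.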
As used in Table 2 above, "inactive" denotes DH
domains which lack the conserved amino acid sequence
representing the active site motif H(X3)G(X4)P found in
DH domains in other PKSs. It will however be appreciated
that these domains may have activity, although this is
likely to be distinct from the activity of DH domains in
other PKSs.
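
A simple pattern search for the H(X3)G(X4)P motif could be
sketched as below, reading X3 and X4 as runs of exactly three
and four arbitrary residues. This is only an illustration: the
helper name and test strings are hypothetical, and a plain
pattern match on the whole sequence is cruder than the
alignment-based comparison on which domain annotation would
normally rest.

```python
import re

# H, any three residues, G, any four residues, P.
DH_ACTIVE_SITE = re.compile(r"H.{3}G.{4}P")


def has_dh_active_site_motif(sequence: str) -> bool:
    """True if the sequence contains the H(X3)G(X4)P pattern anywhere."""
    return DH_ACTIVE_SITE.search(sequence) is not None


# Hypothetical fragments: the first contains the pattern, the second has
# only two residues between H and G and so does not match.
print(has_dh_active_site_motif("AAHAAAGAAAAPAA"))
print(has_dh_active_site_motif("AAHAAGAAAAPAA"))
```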

It will be seen that SEQ ID Nos. 5 (NysA), 6
(NysB), 7 (NysC), 20 and 37 (NysI), 38 (NysJ) and 39
(NysK) constitute actual "PKS" enzymes, namely enzymes
involved in polyketide synthesis. These gene products
contain identifiable enzymatic domains and modules which
are tabulated in Table 2 above and shown also in Figures
4, 8 and 9 (see also Example 1 and Table 4, and Example
4 below which describes the DNA sequence analysis of
nystatin biosynthesis gene cluster in more detail).
Such individual domains and molecules, as identified
herein, form separate aspects of the present invention.

SEQ ID NOs 3 and 36 (NysDII), 4 (NysDI), 8 (NysE),
16 (NysF), 19 (NysDIII), 40 (NysL), 41 (NysM) and 42
(NysN) represent other enzymes functional in polyketide
or macrolide synthesis, e.g. in polyketide release from
PKS, post-translational PKS modification, and polyketide
modification. SEQ ID NOs 10 to 15 and 43 (NysRI to
NysRV, and ORF2) respectively represent transcriptional
regulators, and SEQ ID NOs 17 and 18 (NysG and NysH)
represent transport proteins which are presumed to be
involved in polyketide transport from the cell. This is
also described in more detail in the Examples below.
Such functional proteins represent separate aspects of
the present invention. Also included are functional
parts or fragments of such proteins, i.e. active parts or
fragments which retain (i.e. exhibit measurable levels
of) the biological activity of the parent molecule from
which they are derived (i.e. of the whole protein or
polypeptide).

The nucleotide sequences of the present invention
provide important tools and information which can be
utilised in a number of ways to manipulate nystatin
biosynthesis, to synthesise new nystatin derivatives or
novel polyketide or macrolide structures, and to provide
novel or modified PKS systems (by "PKS system" here is
meant a polyketide synthesis system i.e. a gene cluster
or protein complex, collection or assembly, which is
functional in polyketide synthesis, but which is not
necessarily restricted to PKS enzymes or enzymatic
domains, and which may contain also other functional
activities, e.g. other enzymatic (e.g. modificatory) or
transporter or regulatory functional proteins).

Thus, for example, the entire nystatin PKS gene
cluster or PKS synthetic system as provided herein, or a
portion thereof, may be subjected to modification so as
to modify one or more genes, or one or more modules, or
enzymatic domains, or functional sequences within it.
Such modified or derivatised PKS systems may be used to
synthesize novel or modified polyketide moieties, as
will be described in more detail below. In this
situation, the nystatin PKS system provided herein, or a
fragment or portion thereof, may function as an "origin"
or "template" or "source" system or sequence for
modification.

More particularly, in one such embodiment and as
further described below, the non-functional parts (e.g.
non-biologically active parts) of said system may be
utilised as a "scaffold", and the functional parts (e.g.
sequences encoding enzymatic portions) may be modified
to yield the derivative or modified PKS system. In some
embodiments only a single selected, or few selected
functional (e.g. enzymatic) regions may be modified,
leaving the remaining sequence or structure largely
intact.

Alternatively, the functional portions may be
utilised as tools or materials for the modification of
other "scaffold" structures e.g. individual nystatin
genes, modules or domains may be used for introduction
(e.g. insertion or replacement) into other PKS scaffold
structures e.g. PKS scaffold systems derived from PKS
systems for other macrolide antibiotics e.g.
erythromycin, rapamycin etc.

Included within the scope of the invention are
synthetic or recombinant polyketide synthase enzymes
derived from the scaffold encoded by the nystatin gene
cluster which are modified to include one or more
functional units derived from other modular enzymes.
Such functional units may encode a catalytic or
transport protein domain, for example a ketoreductase
domain from a PKS enzyme or an ACP domain from a modular
hybrid polyketide/peptide synthesising enzyme. Such
domains can be derived from enzyme domain DNA sequences
from, for example, polyketide synthesising enzymes,
peptide synthesising enzymes, hybrid peptide polyketide
synthesising enzymes, fatty acid synthesising enzymes or
other enzyme domains known in the art. Analogously,
there are included within the scope of the invention,
synthetic or recombinant polyketide synthase enzymes
derived from the scaffold encoded by a different
polyketide synthase gene cluster, or modular enzyme
encoding gene cluster, which are modified to include one
or more functional units derived from the nystatin gene
cluster.

Thus, the sequence and activity information
provided here for the nystatin biosynthesis gene cluster
may be used to alter existing known gene clusters and
hence the products they produce. In particular,
selection and incorporation of particular domains
described herein (or modification of existing sequences)
into existing PKS gene clusters will allow incorporation
of particular properties attributable to the nystatin
gene cluster.

Thus, in a very general sense, the present
invention provides the use of the nucleic acid molecules
of the invention as defined herein in the preparation of
a modified PKS system, or in the preparation of modified
polyketide molecules.

Such novel or modified polyketide or macrolide
molecules form a separate aspect of the present
invention.

The nucleotide sequences may be utilised in this
way according to the present invention in a random or
directed or designed manner, e.g. to obtain and test a
particular predetermined or pre-designed structure, or
to create random molecules, for example libraries of
polyketide structures, e.g. for screening (this is also
described in more detail below).

Whether for modification within the nystatin-PKS
scaffold, or for introduction into an alternative
scaffold structure, the genes or genetic elements which
can be modified include not only the actual PKS genes
(which encode NysA, NysB, NysC, NysI, NysJ and NysK) or
the individual molecules or domains thereof, but also
genes encoding other enzymes or functional proteins
involved in nystatin biosynthesis and transport
(referred to herein collectively as "PKS genes" or
"nystatin genes").

As regards the actual PKS genes, as will be
described in more detail below, these may be modified to
change the nature of the loading domain molecule which
determines the nature of the starter unit, the number of
modules, the nature of the extender, as well as the
various dehydratase, reductase and synthase activities
which determine the structure of the polyketide chain.

Other genes which can be modified include the
thioesterase gene (encoding NysE; SEQ ID NO. 8), which
may be modified to increase the efficiency of the PKS
system (in the case of a thioesterase having "editing"
activity which clears the inappropriate substrates from
the PKS). If the thioesterase simply cleaves the final
product off the PKS, it can be used for making nystatin
derivatives with a smaller macrolactone ring by
truncating nystatin PKS, and fusing this thioesterase to
the end of the truncated protein via genetic
engineering.

Regulatory genes: activators can be overexpressed
and repressors inactivated in order to boost polyketide
or antibiotic production. This may be of particular
importance for the production of new nystatin
derivatives in recombinant strains (which may be
produced in very small quantities).

The putative 4'-phosphopantetheine (PPT)
transferase gene (encoding NysF; SEQ ID NO: 16) can be
overexpressed in order to achieve efficient
post-translational modification and full functionality
of the PKS. It can also be used for expression of the
nystatin (or other) PKS in a heterologous host which
lacks the specific PPT activity. Such hosts may include
E. coli, Saccharomyces cerevisiae, etc.

Deoxysugar genes: glycosyltransferase (encoding
NysDI; SEQ ID NO: 4) can be overexpressed in order to
boost glycosylation of the synthesised molecules e.g.
novel nystatin derivatives. It can also be modified by
in vitro mutagenesis in order to increase its
specificity towards the new substrates. Inactivation of
this glycosyltransferase will result in a recombinant
strain producing non-glycosylated nystatin (probably also
lacking some modifications) which can be used, for
example, for chemical modifications, or enzymatic assays
for screening new modification activities.

Aminotransferase (NysDII; SEQ ID Nos. 3 and 36) may
be inactivated to give a nystatin derivative. This
enzyme is presumed to attach the amino group on the
deoxysugar mycosamine. This gene may also be expressed
in other streptomycetes in order to achieve the same
reaction with another deoxysugar normally lacking an
amino group.

ABC transporters (e.g. NysG and NysH; SEQ ID NOs 17
and 18) can be overexpressed in order to make the
efflux of nystatin and its derivatives more efficient.
They may also be mutated in order to shift their
specificity towards different compounds. They may be
inactivated, if it is desired for any reason to
accumulate the nystatin or its derivatives inside the
cell.

The genes encoding monooxygenases NysL and NysN
(SEQ ID Nos. 40 and 42) can be inactivated in S. noursei
in order to obtain non-hydroxylated and non-oxidized
nystatin derivatives. Alternatively, they can be
mutated with the aim of changing their specificities
toward nystatin precursors. Overexpression of nysL and
nysN may potentially lead to increased yield of nystatin
or its derivatives if the hydroxylation and/or oxidation
steps are limiting in the nystatin biosynthetic pathway.
Genetic manipulations with nysM encoding ferredoxin (SEQ
ID No. 41) might also be useful if one wishes to
establish an in vitro P450 hydroxylase system for
modifications of nystatin precursors.

Thus, in addition to modification of the nystatin 
PKS system, or modification of other PKS systems by 
using the "nystatin" genes, the nucleic acid molecules 
of the invention can also be utilised to manipulate or 
facilitate the biosynthetic process, for example by 
extending the host range or increasing yield or 
production efficiency etc. 

In order to enable practice of the invention 
according to the principles above, the invention also 
provides an expression vector, and host cells containing 
a nucleic acid molecule as herein defined. 

Also provided are methods for production of a
polyketide or macrolide molecule (e.g. nystatin or a
nystatin derivative), comprising expressing within a
host cell a nucleic acid molecule as defined above.
The polyketide or macrolide molecule produced within the
host cell or secreted or exported by the host cell as a
result of expression of the nucleic acid molecule (i.e.
expression of the introduced "PKS synthesis machinery")
may then be recovered.

This method of the invention may thus involve
growing or cultivating the host cell under conditions
whereby the nucleic acid molecule is expressed, and
allowing the expression product(s) of the nucleic acid
molecule to synthesise the polyketide/macrolide
molecule, or in other words, growing or cultivating the
host cell under conditions wherein the polyketide or
macrolide is produced.

Also provided are methods for preparing recombinant
nucleic acid molecules according to the invention,
comprising inserting the nucleic acid molecules
containing the nucleotide sequences of the invention
into another nucleic acid molecule, e.g. into vector
nucleic acid, e.g. vector DNA.

Expression vectors of the invention may include
appropriate control sequences such as for example
translational (e.g. start and stop codons, ribosomal
binding sites) and transcriptional control elements
(e.g. promoter-operator regions, termination stop
sequences) linked in matching reading frame with the
nucleic acid molecules of the invention.

Vectors according to the invention may include
plasmids and viruses (including both bacteriophage and
eukaryotic viruses) according to techniques well known
and documented in the art, and may be expressed in a
variety of different expression systems, also well known
and documented in the art.

A variety of techniques are known and may be used
to introduce such vectors into prokaryotic or eukaryotic
cells for expression, or into germ line or somatic cells
to form transgenic animals. Suitable transformation or
transfection techniques are well described in the
literature.

The invention also includes transformed or
transfected prokaryotic or eukaryotic host cells, or
transgenic organisms containing a nucleic acid molecule
according to the invention as defined above. Such host
cells may for example include prokaryotic cells such as
E. coli, Streptomyces and other bacteria, eukaryotic
cells such as yeasts or the baculovirus-insect cell
system, transformed mammalian cells and transgenic
animals and plants.

The nucleic acid molecules contained in the
expression vectors, and host cells and organisms etc.
above may also be, as will be described in more detail
below, derivative nucleic acid molecules, derived from
the nucleic acid molecules defined above, either by
modification or by introducing said molecules or parts
thereof into, or combining with, other nucleic acid
molecules.

Thus, in one aspect, the invention provides
recombinant materials for the production of
combinatorial libraries of polyketides wherein the
polyketide members of the library are synthesized by
modified PKS systems derived from the naturally
occurring nystatin A1 system provided herein by using
this system as a scaffold. Generally, many members of
these libraries may themselves be novel compounds, and
the invention further includes novel polyketide members
of these libraries. The invention methods may thus be
directed to the preparation of an individual polyketide.
The polyketide may or may not be novel, but the method
of preparation permits a more convenient method of
preparing it. The resulting polyketides may be further
modified to convert them to antibiotics, typically
through glycosylation. The invention also includes
methods to recover novel polyketides with desired
binding activities by screening the libraries of the
invention.

Thus, in one aspect, the invention is directed to a
method of preparing a nucleic acid molecule which
contains or comprises a nucleotide sequence encoding a
modified polyketide synthase enzyme or enzyme system.
The method comprises using the nystatin PKS encoding
sequence as provided herein as a scaffold and modifying
the portions of the nucleotide sequence that encode
enzymatic or other functional activities e.g. by
mutagenesis, inactivation (e.g. by deletion or
insertion), or replacement. The thus modified
nucleotide sequence encoding a modified PKS can then be
used to modify a suitable host cell and the cell thus
modified employed to produce a polyketide different from
that produced by the native nystatin PKS, whose
scaffolding has been used to support modifications of
enzymatic activity.

Alternatively, one or more portions of the
nucleotide sequence that encode enzymatic or other
functional activities may be introduced into an
alternative (i.e. different "second") PKS scaffold (i.e.
a scaffold derived from a further "second" PKS system,
different from the nystatin PKS system).

The invention is also directed to polyketides thus 
produced and the antibiotics to which they may then be 
converted. 

In another aspect, the invention is directed to a
multiplicity of cell colonies comprising a library of
colonies wherein each colony of the library contains an
expression vector for the production of a different
modular PKS, but derived from the nystatin PKS of the
invention, as defined above. The library of different
modular PKS may be obtained by modifying one or more of
the regions of a naturally occurring "nystatin" gene or
gene cluster encoding an enzymatic activity so as to
alter that activity, leaving intact the scaffold
portions of the naturally occurring gene.

In another aspect, the invention is directed to a
multiplicity of cell colonies comprising a library of
colonies wherein each colony of the library contains a
different modular PKS derived from the nystatin PKS of
the invention. The invention is also directed to
methods to produce libraries of PKS complexes and to
produce libraries of polyketides by culturing these
colonies, as well as to the libraries so produced. In
addition, the invention is directed to methods to screen
the resulting polyketide libraries and to novel
polyketides contained therein.
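
The combinatorial character of such a library can be
illustrated by enumerating every combination of a few
per-domain options. The sketch below is purely schematic: the
domain names KR5, ER5 and AT11 are taken from Table 2, but the
options listed for each are hypothetical examples and do not
represent constructs described herein.

```python
from itertools import product
from typing import Dict, List


def enumerate_designs(options: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return one design per combination of the per-domain options."""
    names = list(options)
    return [dict(zip(names, combo))
            for combo in product(*(options[name] for name in names))]


# Hypothetical options for three domains of the scaffold.
OPTIONS = {
    "KR5": ["native", "inactivated"],
    "ER5": ["native", "inactivated"],
    "AT11": ["native", "replaced"],
}

designs = enumerate_designs(OPTIONS)
print(len(designs))   # 2 x 2 x 2 combinations
print(designs[0])     # the unmodified (all-native) design
```

Each dictionary in the result would correspond to one member
of the library, i.e. one modified PKS to be built and screened.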

As mentioned above, a structural and functional
sequence analysis of the nystatin PKS gene
cluster/system is presented in the Examples below and
the DNA sequences are shown in SEQ ID NOs 1, 2 and 35,
and further analysed in Tables 1 and 2 above, SEQ ID NOs
3 to 20, and 36 to 42 and in Figures 2, 3, 7, 8 and 9.
The modular and "domain" encoding structure of the "PKS"
gene may be seen. A module may typically contain a
ketosynthase (KS), an acyltransferase (AT) and an acyl
carrier protein (ACP). These three functions are
sufficient to activate an extender unit and attach it to
the remainder of the growing molecule. Additional
activities that may be included in a module relate to
reactions other than the Claisen condensation, and
include a dehydratase activity (DH), an enoylreductase
activity (ER) and a ketoreductase activity (KR). The
loading module catalyses the initial condensation, i.e.
it begins with a "loading domain" represented by AT and
ACP, which determine the nature of the starter unit.
The "finishing" of the molecule is believed to be
regulated by thioesterase activity (TE) and it is
believed that this is achieved by the TE activity
embedded in NysK. This thioesterase appears to catalyse
cyclization of the macrolide ring thereby increasing the
yield of the polyketide product. The NysE TE activity
is believed to be an "editing" one, and participates in
cleaving off certain substrates from the nystatin PKS
complex.

It will be seen from the sequences, Figures and
Tables above and below, that the regions in the genes
and modules that encode enzymatic activities are
separated by linker or "scaffold"-encoding regions.
These scaffold regions encode amino acid sequences that
space the enzymatic/functional activities at the
appropriate distances and in the correct order. Thus,
these linker regions collectively can be considered to
encode a "scaffold" into which the various activities
are placed in a particular order and spatial
arrangement. It should however be noted that in some
instances regions of the scaffold may be deleted or
modified without significantly affecting the activity of
the resultant PKS. Indeed the sequence encoding some
domains with functional activity may be fully or
partially deleted without significant effects. Thus as
used herein "scaffold" refers to portions of the PKS
cluster not directly attributable to functional e.g.
enzymatic activities of said PKS, but responsible for
maintenance of its overall activity, e.g. providing
correct spatial orientation or structure. Said scaffold
may comprise all linker and non-functional regions of
the PKS or a functionally active (i.e. retaining
structural integrity) part thereof. This organization is
similar in other naturally occurring modular PKS gene
clusters.

The invention provides libraries or individual
modified forms, ultimately of polyketides, by generating
modifications in the nystatin PKS gene cluster so that
the protein complexes produced by the cluster have
altered activities in one or more respects, and thus
produce polyketides other than the natural product of
the PKS (i.e. nystatin A1). Novel polyketides may thus
be prepared, or polyketides in general prepared more
readily, using this method. By providing a large number
of different genes or gene clusters derived from the
naturally occurring nystatin PKS gene cluster, each of
which has been modified in a different way from the
native cluster, an effectively combinatorial library of
polyketides can be produced as a result of the multiple
variations in these activities. Alternatively the
nystatin PKS "functional regions" (e.g. genes, modules
or domains) may be used, for introduction into a
"scaffold" obtained from another naturally occurring PKS
system. The modified PKS encoding sequences and systems
used in the present invention thus represent modular
polyketide synthases "derived from" a naturally
occurring nystatin PKS.

By a modular PKS "derived from" the nystatin PKS is
meant a modular polyketide synthase (or its
corresponding encoding gene(s)) that retains the
scaffolding of all of the utilized portion of the
naturally occurring gene. (Not all modules or genes
need be included in the constructs.) On the constant
scaffold, at least one enzymatic or functional activity
is mutated, deleted or replaced, so as to alter the
activity. Alteration results when these activities are
deleted or are replaced by a different version of the
activity, or simply mutated in such a way that a
polyketide other than the natural product results from
these collective activities. This occurs because there
has been a resulting alteration of the starter unit
and/or extender unit, and/or stereochemistry, and/or
chain length or cyclization and/or reductive or
dehydration cycle outcome at a corresponding position in
the product polyketide. Where a deleted activity is
replaced, the origin of the replacement activity may
come from a corresponding activity in a different
naturally occurring polyketide synthase or from a
different region of the same PKS. Alternatively, such a
"derived" modular PKS may incorporate one or more
enzymatic or other functional activities (or their
encoding nucleotide sequences) obtained or derived from
the nystatin PKS described herein, in the scaffolding of
a second, different modular PKS (or its gene).

Modification or manipulation of the modular PKS may
involve truncation, e.g. gene or domain or module
deletion or domain/gene/module swapping, addition or
inactivation, which may involve insertion or deletion.
Alternatively, random or directed modifications (i.e.
mutations) may be made in the nucleotide sequence of the
selected portion (e.g. in a gene/domain/module etc).

The derivative may contain preferably at least a
thioesterase activity from the nystatin PKS gene
cluster.

Advantageously, a polyketide synthase "derived
from" the nystatin PKS may contain the scaffolding
encoded by all or the portion employed of the nystatin
synthase gene, at least two modules that are functional
(preferably four or more modules), and mutations,
deletions, or replacements of one or more of the
activities of these functional modules so that the
nature of the resulting polyketide is altered. This
definition applies both at the protein and genetic
levels. Particular preferred embodiments include those
wherein a KS, AT, KR, DH or ER has been inactivated or
deleted or replaced by a version of the activity from a
different PKS or from another location within the same
PKS. Also preferred are derivatives where at least one
noncondensation cycle enzymatic activity (KR, DH or ER)
has been deleted or wherein any of these activities has
been mutated so as to change the ultimate polyketide
synthesized.

Thus, there are five degrees of freedom for
constructing a polyketide synthase in terms of the
polyketide that will be produced. First, the polyketide
chain length will be determined by the number of modules
in the PKS system. Second, the nature of the carbon
skeleton of the polyketide will be determined by the
specificities of the acyl transferases which determine
the nature of the extender units at each position --
e.g. malonyl, methyl malonyl, or ethyl malonyl, etc.
Third, the loading domain specificity will also have an
effect on the resulting carbon skeleton of the
polyketide. Thus, the loading domain may use a
different starter unit, such as acetyl, propionyl, and
the like. Fourth, the oxidation state at various
positions of the polyketide will be determined by the
dehydratase and reductase portions of the modules. This
will determine the presence and location of ketone,
alcohol, double bonds or single bonds in the polyketide.
Finally, the stereochemistry of the resulting
polyketide is a function of three aspects of the
synthase. The first aspect is related to the AT/KS
specificity associated with substituted malonyls as
extender units, which affects stereochemistry only when
the reductive cycle is missing or when it contains only
a ketoreductase since the dehydratase would abolish
chirality. Second, the specificity of the ketoreductase
will determine the chirality of any β-OH. Finally, the
enoyl reductase specificity for substituted malonyls as
extender units will influence the result when there is a
complete KR/DH/ER available.
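
By way of illustration only, the first, second and fourth
degrees of freedom above may be sketched in Python. The
sketch assumes that KR, DH and ER act strictly in that
order, so that a later activity has no effect when an
earlier one is absent. The names Module, beta_carbon_state,
without_domain and describe, and the domain content of the
two example modules, are hypothetical and are not taken
from the sequences of the invention.

```python
from dataclasses import dataclass, replace

# Minimal domain set for one condensation cycle, as stated in the text.
CORE = frozenset({"KS", "AT", "ACP"})


@dataclass(frozen=True)
class Module:
    """Hypothetical record for one extension module (illustrative only)."""
    name: str
    domains: frozenset
    extender: str = "malonyl"  # extender unit selected by the AT domain


def beta_carbon_state(module: Module) -> str:
    """Return the group expected at the beta-carbon after this module acts.

    Simplifying assumption: KR, DH and ER act strictly in that order, so a
    later domain has no effect when an earlier one is absent or inactive.
    """
    d = module.domains
    if not CORE <= d:
        raise ValueError(f"{module.name}: KS, AT and ACP are all required")
    if "KR" not in d:
        return "ketone"
    if "DH" not in d:
        return "alcohol"
    if "ER" not in d:
        return "double bond"
    return "single bond"


def without_domain(module: Module, domain: str) -> Module:
    """Model deletion or inactivation of one domain on a constant scaffold."""
    return replace(module, domains=module.domains - {domain})


def describe(modules):
    """One entry per module: chain length follows the module count."""
    return [(m.name, m.extender, beta_carbon_state(m)) for m in modules]


# Hypothetical two-module example; the domain content is invented.
full = Module("module_a", CORE | {"KR", "DH", "ER"})
partial = Module("module_b", CORE | {"KR"}, extender="methylmalonyl")
```

Under these assumptions, removing ER from a module carrying
a complete KR/DH/ER set changes the predicted outcome at
that position from a single bond to a double bond, which is
the kind of alteration contemplated for the derived PKSs.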

Thus, the modular nystatin PKS system permits a
wide range of polyketides to be synthesized. As
compared to the aromatic PKS systems, a wider range of
starter units including aliphatic monomers (acetyl,
propionyl, butyryl, isovaleryl, etc.), aromatics
(aminohydroxybenzoyl), alicyclics (cyclohexanoyl), and
heterocyclics (thiazolyl) are found in various
macrocyclic polyketides. Recent studies have shown that
modular PKSs have relaxed specificity for their starter
units. Modular PKSs also exhibit considerable variety
with regard to the choice of extender units in each
condensation cycle. The degree of β-ketoreduction
following a condensation reaction has also been shown to
be altered by genetic manipulation (Donadio, S. et al.
Proc. Natl. Acad. Sci. USA (1993) 90:7119-7123).
Likewise, the size of the polyketide product can be
varied by designing mutants with the appropriate number
of modules (Kao, C. M. et al. J. Am. Chem. Soc. (1994)
116:11612-11613). Lastly, these enzymes are
particularly well-known for generating an impressive
range of asymmetric centres in their products in a
highly controlled manner. The polyketides and
antibiotics produced by the methods of the present
invention are typically single stereoisomeric forms.
Although the compounds of the invention can occur as
mixtures of stereoisomers, it is more practical to
generate individual stereoisomers using this system.
Thus, the combinatorial potential within modular PKS
pathways based on any naturally occurring modular PKS
scaffold, such as the nystatin PKS scaffold, is
virtually unlimited.

In general, the polyketide products of the PKS must
be further modified, typically by glycosylation, in
order to exhibit antibiotic activity. Methods for
glycosylating the polyketides are generally known in the
art; the glycosylation may be effected intracellularly
by providing the appropriate glycosylation enzymes or
may be effected in vitro using chemical synthetic means.

The macrolide antibiotics, polyketide moieties of
which are synthesised by modular PKSs, may contain any
of a number of different deoxysugars. The nystatin
molecule contains a mycosamine deoxysugar moiety.
Deoxysugar biosynthesis starts typically with
glucose-1-phosphate and proceeds through the action of
dTDP-glucose synthase and dTDP-glucose-4,6-dehydratase.
The product of the latter, typically a
dTDP-4,6-keto-6-deoxyglucose, is further subjected to at
least two of the following reactions - epimerisation,
isomerisation, reduction, dehydration, transamination,
or methylation - to give a dTDP-D-deoxysugar. The latter
is then attached to the macrolactone ring via the action
of a glycosyl transferase, hence providing for the
glycosylation of the macrolide compound.

Glycosylation can also be effected using the
non-glycosylated macrolides as starting materials, and
using mutants of streptomycetes or other organisms (e.g.
Myxococcus, Pseudomonas, Mycobacterium etc.) that can
provide the glycosylation activities. Alternatively,
glycosyltransferase-encoding genes from the organisms
mentioned above can be introduced in S. noursei or other
organisms containing the native or modified nystatin PKS
genes or portions thereof in order to provide the
desired glycosylation. The deoxysugar biosynthesis
genes from the nystatin gene cluster can be used for
complementation of corresponding activities in different
PKS producers, as well as for engineering the
biosynthetic pathways for alternative deoxysugars.

The derivatives of nystatin PKS can be prepared by
manipulation of the relevant genes, or by introducing
the nystatin genes or portions thereof into another PKS.
A large number of modular PKS gene clusters have been
mapped and/or sequenced, including erythromycin,
soraphen A, rifamycin, avermectin and rapamycin, which
have been completely mapped and sequenced, and FK506 and
oleandomycin which have been partially sequenced, and
candicidin, pimaricin and nemadectin which have been
mapped and partially sequenced. Additional modular PKS
gene clusters are expected to be available as time
progresses. These genes can be manipulated using
standard techniques to delete or inactivate activity
encoding regions, insert regions of genes encoding
corresponding activities from the same or different PKS
systems, or otherwise mutate using standard procedures
for obtaining genetic alterations. Of course, portions
of, or all of, the desired derivative coding sequences
can be synthesized using standard solid phase synthesis
methods such as those described by Jaye et al., J. Biol.
Chem. (1984) 259:6331, and which are available
commercially from, for example, Applied Biosystems, Inc.
In order to obtain nucleotide sequences encoding a
variety of derivatives of the naturally occurring PKS,
and thus a variety of polyketides for construction of a
library, a desired number of constructs can be obtained
by "mixing and matching" enzymatic activity-encoding
portions, and mutations can be introduced into the
native host (i.e. nystatin) PKS gene cluster or portions
thereof.

Mutations can be made to the native sequences using
conventional techniques. The substrates for mutation
can be an entire cluster of genes or only one or two of
them; the substrate for mutation may also be portions of
one or more of these genes. Techniques for mutation are
well known in the art and described in the literature,
for example in WO 98/49315 and the references cited
therein. Such techniques include preparing synthetic
oligonucleotides including the mutation(s) and inserting
the mutated sequence into the gene encoding a PKS
subunit using restriction endonuclease digestion. (See,
e.g. Kunkel, T.A., Proc. Natl. Acad. Sci. USA (1985)
82:448; Geisselsoder et al. BioTechniques (1987) 5:786.)
Alternatively, the mutations can be effected using a
mismatched primer (generally 10-20 nucleotides in
length) which hybridizes to the native nucleotide
sequence, at a temperature below the melting temperature
of the mismatched duplex. The primer can be made
specific by keeping primer length and base composition
within relatively narrow limits and by keeping the
mutant base centrally located (Zoller and Smith, Methods
Enzymol. (1983) 100:468). Primer extension is effected
using DNA polymerase, the product cloned and clones
containing the mutated DNA, derived by segregation of
the primer extended strand, selected. Selection can be
accomplished using the mutant primer as a hybridization
probe. The technique is also applicable for generating
multiple point mutations. See, e.g. Dalbadie-McFarland
et al. Proc. Natl. Acad. Sci. USA (1982) 79:6409. PCR
mutagenesis will also find use for effecting the desired
mutations.
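
The placement of the mutant base in a mismatched primer can
be illustrated by the following Python sketch, which is
offered only as an aid to understanding. It assumes a
single-base substitution, takes the 10-20 nucleotide length
limits from the text above, and writes the primer in the
sense of the strand supplied (the oligonucleotide annealing
to that same strand would be its reverse complement). It
does not estimate melting temperature or check base
composition. The function name and the example template are
hypothetical.

```python
def design_mismatch_primer(template: str, pos: int, new_base: str,
                           length: int = 17) -> str:
    """Return a primer of `length` nt with `new_base` placed at its centre.

    `pos` is the 0-based index in `template` of the base to be changed.
    The primer matches the template everywhere except at that position.
    """
    template = template.upper()
    new_base = new_base.upper()
    if not 10 <= length <= 20:
        raise ValueError("primer length should be 10-20 nucleotides")
    if new_base not in {"A", "C", "G", "T"}:
        raise ValueError("new_base must be one of A, C, G or T")
    if not 0 <= pos < len(template):
        raise ValueError("pos lies outside the template")
    if template[pos] == new_base:
        raise ValueError("new_base equals the native base; no mismatch")

    # Centre the window on the mutant base.
    start = pos - length // 2
    end = start + length
    if start < 0 or end > len(template):
        raise ValueError("primer window runs off the template")

    window = template[start:end]
    offset = pos - start
    return window[:offset] + new_base + window[offset + 1:]


# Hypothetical 28-nt template; position 14 (a G) is changed to T.
example_template = "ATGC" * 7
example_primer = design_mismatch_primer(example_template, 14, "T")
```

Keeping the substitution at the centre of the window leaves
matched sequence of roughly equal length on both sides of
the mismatch, in line with the specificity considerations
set out above.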

Random mutagenesis of selected portions of the
nucleotide sequences encoding enzymatic activities can
be accomplished by several different techniques known in
the art, e.g. by inserting an oligonucleotide linker
randomly into a plasmid, by irradiation with X-rays or
ultraviolet light, by incorporating incorrect
nucleotides during in vitro DNA synthesis, by
error-prone PCR mutagenesis, by preparing synthetic
mutants or by damaging plasmid DNA in vitro with
chemicals. Chemical mutagens include, for example,
sodium bisulfite, nitrous acid, nitrosoguanidine,
hydroxylamine, agents which damage or remove bases
thereby preventing normal base-pairing such as hydrazine
or formic acid, analogues of nucleotide precursors such
as 5-bromouracil, 2-aminopurine, or acridine
intercalating agents such as proflavine, acriflavine,
quinacrine, and the like. Generally, plasmid DNA or DNA
fragments are treated with chemicals, transformed into
E. coli and propagated as a pool or library of mutant
plasmids.

In addition to providing mutated forms of regions
encoding enzymatic or other functional activity, regions
encoding the desired functions or activities may be
recovered from different locations in the same nystatin
PKS, for example, using PCR techniques with appropriate
primers. By "corresponding" activity encoding regions
is meant those regions encoding the same general type of
activity -- e.g. a ketoreductase activity in one
location of a gene cluster would "correspond" to a
ketoreductase-encoding activity in another location in
the gene cluster.

If replacement of a particular target region in a
host polyketide synthase is to be made (be this host
nystatin PKS, or a different PKS into which "nystatin"
sequences are to be inserted), this replacement can be
conducted in vitro using suitable restriction enzymes or
can be effected in vivo using recombinant techniques
involving homologous sequences framing the replacement
gene in a donor plasmid and a receptor region in a
recipient plasmid, genome or chromosome. Such systems,
advantageously involving plasmids of differing
temperature sensitivities, are described, for example,
in PCT application WO 96/40968.

WO 00/77181 describes methods of assembling several 
DNA units in sequence into large DNA constructs which 
are applicable to the recombinant polyketide synthases 
within the scope of the invention. 

The vectors used to perform the various operations
to replace the enzymatic activity in the host PKS genes
or to support mutations in these regions of the host PKS
genes may be chosen to contain control sequences
operably linked to the resulting coding sequences in a
manner that expression of the coding sequences may be
effected in an appropriate host. However, simple cloning
vectors may be used as well.

If the cloning vectors employed to obtain PKS genes
encoding derived PKS lack control sequences for
expression operably linked to the encoding nucleotide
sequences, the nucleotide sequences are inserted into
appropriate expression vectors. This need not be done
individually, but a pool of isolated encoding nucleotide
sequences can be inserted into host vectors, the
resulting vectors transformed or transfected into host
cells and the resulting cells plated out into individual
colonies.

Suitable control sequences include those which
function in eukaryotic and prokaryotic host cells.
Preferred hosts include prokaryotic hosts and fungal
systems such as yeast, but single cell cultures of, for
example, mammalian cells could also be used. There is
no particular advantage, however, in using such systems.
Particularly preferred are yeast and prokaryotic hosts
which use control sequences compatible with Streptomyces
spp. Suitable control sequences for single cell
cultures of various types of organisms are well known in
the art. Systems for expression in yeast, including
control sequences which effect secretion, are widely
available and are routinely used. Control elements
include promoters, optionally containing operator
sequences, and other elements depending on the nature of
the host, such as ribosome binding sites. Particularly
useful promoters for prokaryotic hosts include those
from PKS gene clusters which result in the production of
polyketides as secondary metabolites, including those
from aromatic (Type II) PKS gene clusters. Examples are
act promoters, tcm promoters, spiramycin promoters, and
the like. However, other bacterial promoters, such as
those derived from sugar metabolizing enzymes, such as
galactose, lactose (lac) and maltose, are also useful.
Additional examples include promoters derived from genes
encoding biosynthetic enzymes such as tryptophan
synthase (trp), the β-lactamase (bla), bacteriophage
lambda PL, T5 and T7. In addition, synthetic promoters,
such as the tac promoter (U.S. Patent No. 4,551,433),
can be used.

Other regulatory sequences may also be desirable
which allow for regulation of expression of the PKS
replacement sequences relative to the growth of the host
cell. Regulatory sequences are known to those of skill
in the art, and examples include those which cause the
expression of a gene to be turned on or off in response
to a chemical or physical stimulus, including the
presence of a regulatory compound. Other types of
regulatory elements may also be present in the vector,
for example, enhancer sequences.

Selectable markers can also be included in the
recombinant expression vectors. A variety of markers
are known which are useful in selecting for transformed
cell lines and generally comprise a gene whose
expression confers a selectable phenotype on transformed
cells when the cells are grown in an appropriate
selective medium. Such markers include, for example,
genes which confer antibiotic resistance or sensitivity
to the plasmid. Alternatively, several polyketides are
naturally coloured and this characteristic provides a
built-in marker for screening cells successfully
transformed by the present constructs.

The various PKS nucleotide sequences, or a mixture
of such sequences, can be cloned into one or more
recombinant vectors as individual cassettes, with
separate control elements, or under the control of, e.g.
a single promoter. The PKS subunits or mixture of
components can include flanking restriction sites to
allow for the easy deletion and insertion of other PKS
subunits or mixture components so that hybrid PKSs can
be generated. The design of such unique restriction
sites is known to those of skill in the art and can be
accomplished using the techniques described above, such
as site-directed mutagenesis and PCR.

As described above, particularly useful control
sequences are those which themselves, or using suitable
regulatory systems, activate expression during
transition from growth to stationary phase in the
vegetative mycelium. The system contained in plasmid
RM5, i.e. the actI/actIII promoter pair and the
actII-ORF4, an activator gene, is particularly preferred
(McDaniel et al., Science, v. 262, p 1546-1550, 1993).
Particularly preferred hosts are those which lack their
own means for producing polyketides so that a cleaner
result is obtained. Illustrative host cells of this type
include the modified S. coelicolor CH999 culture
described in PCT application WO 96/40968 and similar
strains of S. lividans.

The expression vectors containing nucleotide
sequences encoding a modified PKS system or a variety of
PKS systems for the production of different polyketides
may then be transformed into the appropriate host cells,
e.g. to construct a library. In one straightforward
approach, a mixture of such vectors is transformed into
the selected host cells and the resulting cells plated
into individual colonies and selected for successful
transformants. Each individual colony will then
represent a colony with the ability to produce a
particular PKS synthase and ultimately a particular
polyketide. Typically, there will be duplications in
some of the colonies; the subset of the transformed
colonies that contains a different PKS in each member
colony can be considered the library. Alternatively,
the expression vectors can be used individually to
transform hosts, which transformed hosts are then
assembled into a library. A variety of strategies might
be devised to obtain a multiplicity of colonies each
containing a PKS gene cluster derived from the nystatin
host gene cluster so that each colony in the library
produces a different PKS and ultimately a different
polyketide. The number of different polyketides that
are produced by the library is typically at least three
(e.g. 2 mutations in the PKS genes which may appear
separately or in combination), more typically at least
ten, and preferably at least 20, more preferably at
least 50, reflecting similar numbers of different
altered PKS gene clusters and PKS gene products. The
number of members in the library is arbitrarily chosen;
however, the degrees of freedom outlined above with
respect to the variation of starter, extender units,
stereochemistry, oxidation state, and chain length are
quite large.
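
The figure of "at least three" for two mutations appearing
separately or in combination can be checked by simple
enumeration, as in the following illustrative Python
sketch. It assumes that the mutations are independent and
that every combination yields a distinct altered PKS, which
need not hold in practice. The function name and the
mutation labels are hypothetical.

```python
from itertools import combinations


def library_variants(mutations):
    """List every non-empty combination of the given mutations.

    Assumes the mutations are independent and that each combination gives
    a distinct altered PKS; n mutations then give 2**n - 1 variants.
    """
    variants = []
    for k in range(1, len(mutations) + 1):
        variants.extend(combinations(mutations, k))
    return variants


# Two hypothetical mutations, alone or together, give three variants.
two = library_variants(["mutation_1", "mutation_2"])

# Six hypothetical independent mutations give 63 variants.
six = library_variants([f"mutation_{i}" for i in range(1, 7)])
```

On the same assumptions, six independent mutations would
already suffice, in principle, to exceed the preferred
library size of 50.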

Methods for introducing the recombinant vectors of
the present invention into suitable hosts are known to
those of skill in the art and typically include the use
of CaCl2 or other agents, such as divalent cations,
lipofection, conjugation, protoplast transformation and
electroporation.

A wide variety of hosts can be used, even though
some hosts natively do not contain the appropriate
post-translational mechanisms to activate the acyl
carrier proteins of the synthases. These hosts can be
complemented with the appropriate recombinant enzymes,
for example NysF, to effect these modifications.

The polyketide producing colonies can be identified
and isolated using known techniques and the produced
polyketides further characterized. The polyketides
produced by these colonies can be used collectively in a
panel to represent a library or may be assessed
individually for activity.

The libraries can thus be considered at four
levels: (1) a multiplicity of colonies each with a
different PKS encoding sequence comprising a different
PKS cluster but all derived from the nystatin PKS
cluster; (2) colonies which contain the proteins that
are members of the PKS produced by the coding sequences;
(3) the polyketides produced; and (4) antibiotics
derived from the polyketides.

Colonies in the library can be induced to produce
the relevant synthases and thus to produce the relevant
polyketides to obtain a library of candidate
polyketides. The polyketides produced can be screened
for antimicrobial, antitumour, antihelmintic or
immunosuppressive activities, as well as for binding to
desired targets, such as receptors, signalling proteins,
and the like. The supernatants or culture pellets per
se can be used for screening, or partial or complete
purification of the polyketides can first be effected.
Typically, such screening methods involve detecting the
binding of each member of the library to receptor or
other target ligand. Binding can be detected either
directly or through a competition assay. Means to
screen such libraries for binding are well known in the
art.

Alternatively, individual polyketide members of the
library can be tested against a desired target. In this
event, screens wherein the biological response of the
target is measured can more readily be included.

A large number of novel polyketides may be prepared
according to the method of the invention, as illustrated
by the representative in the Examples below. These
novel polyketides may be useful intermediates in
formation of compounds with antibiotic activity through
glycosylation or reactions as described above. As
indicated above, the individual polyketides may be
reacted with suitable sugar derivatives to obtain
compounds of antibiotic activity. Antibiotic activity
can be verified using typical screening assays such as
those set forth in Lehrer, R. et al., J. Immunol. Meth.
(1991) 137:167-173.

Thus, in a further aspect, the invention provides a
method of preparing a nucleic acid molecule comprising a
nucleotide sequence encoding a modified nystatin PKS,
wherein said modified nystatin PKS is derived from a
nystatin PKS as defined herein (i.e. a naturally
occurring nystatin PKS encoded by a nucleotide sequence
as defined herein) containing first regions which encode
enzymatic or other functional activities and second
regions which encode scaffolding amino acid sequences,
said method comprising

(a) modifying at least one said first region; or

(b) incorporating at least one said first region
into a scaffolding-encoding second region from a
different PKS-encoding nucleotide sequence.

As discussed above in relation to the nucleic acid
molecules of the invention, the first region may be any
part (e.g. encoding a domain or a module or a part
thereof) of a nucleotide sequence of the invention (i.e.
SEQ ID NO 1 or 2 or 35).

Also provided are (i) a method of preparing a
modified nystatin PKS as defined above, said method
comprising expressing a nucleic acid molecule prepared
as defined above within a host cell (i.e. culturing or
growing a host cell containing such a nucleic acid
molecule under conditions whereby the modified nystatin
PKS is expressed), and (ii) a modified nystatin PKS so
produced or so obtainable.

New polyketides produced by such a modified PKS are
also within the scope of the invention, as are new
antibiotics which are generally the glycosylated forms
of these polyketides although some have activity without
glycosylation which may be due to different
post-translation modifications such as hydroxylation or
oxidation.

The invention will now be described in more detail
in the following non-limiting Examples with reference to
the drawings in which:

Figure 1 shows the structure of the polyene
antifungal antibiotic nystatin A1;

Figure 2 presents physical and functional maps of
the E. coli-Streptomyces shuttle vectors pSOK101,
pSOK201 and pSOK804 used in Examples 1, 2 and 3;

Figure 3 is a schematic representation showing two
regions of the S. noursei ATCC 11455 genome encoding the
nystatin biosynthesis gene (corresponding to SEQ ID NOs
1 and 2). Overlapping recombinant phages containing the
presented DNA sequences are shown over the regions
drawings (see Example 1);

Figure 4 is a schematic representation showing the
functional organisation of the nystatin PKS NysA (SEQ ID
NO 5), NysB (SEQ ID NO 6), NysC (SEQ ID NO 7) and NysI
(SEQ ID NO 20) and their roles in nystatin biosynthesis;

Figure 5 is a schematic representation showing
genetic manipulations of the module 5 in NysC PKS
leading to production of new polyene compounds by
recombinant S. noursei strains (see Example 2);

Figure 6 shows the UV spectra (in DMSO) for
nystatin and new polyene compounds S44 and S48 obtained
from recombinant S. noursei strains with genetically
altered NysC PKS (see Example 2);

Figure 7 is a schematic representation showing the
region of the S. noursei ATCC 11455 genome encoding the
nystatin biosynthetic gene cluster (corresponding to SEQ
ID No. 35). Gene organisation within the gene cluster
is shown. The inserts from the overlapping recombinant
phages encompassing the cloned region are shown above
the physical/genetic map. The nys genes are designated
with capital letters in italic, other ORFs are numbered;

Figure 8 is a representation showing a proposed
model for nystatin biosynthesis in S. noursei; and

Figure 9 is a schematic representation showing the
functional organisation of the nystatin PKS NysA (SEQ ID
No. 5), NysB (SEQ ID No. 6), NysC (SEQ ID No. 7), NysI
(SEQ ID No. 37), NysJ (SEQ ID No. 38) and NysK (SEQ ID
No. 39) proteins. KSS - ketosynthase with the Cys to Ser
substitution in active site; KS - ketosynthase; AT -
acetate-specific acyltransferase; mAT -
propionate-specific acyltransferase; DH - dehydratase;
DHi - inactive dehydratase; ER - enoyl reductase; KR -
ketoreductase; KRi - inactive ketoreductase; ACP - acyl
carrier protein.

Figure 10 shows compounds that can be theoretically
produced from the following manipulations within the
nystatin gene cluster:

-insertion of ER domain into module 3 (1);
-insertion of ER domain into module 4 (2);
-simultaneous inactivation of the ER domain in module 5
and insertion of the ER domain into module 3 (3);
-simultaneous inactivation of the ER domain in module 5
and insertion of the ER domain into module 4 (4);
-simultaneous inactivation of the ER domain in module 5
and insertion of the ER domain into module 7 (5);
-simultaneous inactivation of the ER domain in module 5
and insertion of the ER domain into module 8 (6);
-simultaneous inactivation of the ER domain in module 5
and insertion of the ER domain into module 9 (7);
-simultaneous inactivation of the ER domain in module 5
and insertion of the ER domains into modules 8 and 9
(8);
-simultaneous inactivation of the ER domain in module 5
and insertion of the ER domains into modules 7 and 8
(9).

Figure 11 shows compounds that can be theoretically
produced from the following manipulations within the
nystatin gene cluster:

-replacement of methylmalonyl-specific acyltransferase
(AT) domain in module 11 of the nystatin PKS with
malonyl-specific AT domain (10);
-replacement of malonyl-specific AT domain in module 12
with methylmalonyl-specific AT domain with simultaneous
replacement of methylmalonyl-specific AT domain in
module 11 with malonyl-specific AT domain (11);
-replacement of malonyl-specific AT domain in module 10
with methylmalonyl-specific AT domain with simultaneous
replacement of methylmalonyl-specific AT domain in
module 11 with malonyl-specific AT domain (12);
-inactivation of P450 monooxygenase-encoding genes nysL
or nysN (whichever is found to be responsible for
oxygenation of the methyl group at C-16 on the nystatin
molecule) (13).

Figure 12 shows compounds that can be theoretically
produced from the following manipulations within the
nystatin gene cluster:

-inactivation of dehydratase (DH) domain in module 3 of
the nystatin PKS (14);
-inactivation of DH domain in module 4 (15);
-inactivation of DH domain in module 3 with simultaneous
inactivation of ER domain in module 5 (16);
-inactivation of DH domain in module 4 with simultaneous
inactivation of ER domain in module 5 (17);
-inactivation of DH domain in module 7 with simultaneous
inactivation of ER domain in module 5 (18);
-inactivation of DH domain in module 8 with simultaneous
inactivation of ER domain in module 5 (19);
-inactivation of DH domain in module 9 with simultaneous
inactivation of ER domain in module 5 (20).

Example 1 - Cloning of the nystatin biosynthesis gene
cluster

Bacterial strains, plasmids and growth conditions 

Bacterial strains and plasmids used in this study are
listed in Table 3. New strains and plasmids developed in
the course of this study are described herein and shown
in Figure 3. S. noursei ATCC 11455 and its mutants were
grown on solid ISP2 medium (Difco), and in liquid TSB
medium (Oxoid). Intergeneric conjugation from E. coli
ET12567 (pUZ8002) into Streptomyces strains was done as
described in Flett et al, FEMS Microbiol. Lett., v. 155:
223-229, (1997), except for the "heat shock" time, which
was reduced to 5 minutes. Apramycin (Fluka) at a
concentration of 50 mg/ml was used to select for the
S. noursei transconjugants on solid medium. For
inoculation of the S. noursei ATCC 11455 transconjugants
prior to total DNA isolation, liquid TSB medium
supplemented with 20 µg/ml apramycin was used. E. coli
strains were grown and transformed as described in
Sambrook et al, Cold Spring Harbor Laboratory (1989).
E. coli ET12567 (pUZ8002) was maintained on media
containing 20 µg/ml chloramphenicol and 50 µg/ml
kanamycin.



Table 3 - Bacterial strains, plasmids and phages used in
this study

| Strain, plasmid or phage | Properties | Source or reference |
|---|---|---|
| E. coli DH5α | general cloning host | Sambrook et al, 1989, supra |
| E. coli XL1-Blue MRA (P2) | host for the gene library | Stratagene |
| E. coli ET12567 | strain for intergeneric conjugation | MacNeil et al, Gene, v. 111: 61-68, 1992 |
| S. noursei ATCC 11455 | wild type, nystatin producer | ATCC |
| pGEM3Zf(-) | ColE1 replicon, ApR, 3.2 kb | Promega |
| pGEM7Zf(-) | ColE1 replicon, ApR, 3.0 kb | Promega |
| pGEM11Zf(-) | ColE1 replicon, ApR, 3.2 kb | Promega |
| pUZ8002 | RK2 derivative, KmR, TcR | D.H. Figurski |
| pWHM3 | ColE1 + pIJ101 replicons, ThioR, 7.2 kb | Vara et al, J. Bacteriol., v. 171: 5872-5881, 1989 |
| pSET152 | ColE1 replicon + φC31 int, oriT, AmR, 5.5 kb | Bierman et al, Gene, v. 116: 43-49, 1992 |
| pSOK101 | pWHM3 derivative in which the 3.1 kb BamHI/SphI fragment was replaced with the 3.0 kb BamHI/SphI fragment from pSET152 containing ColE1, oriT and AmR, 7.1 kb | This work (see Figure 2) |
| pGM11 | pSG5 replicon, NeoR, 5.3 kb | Wohlleben & Muth, In "Plasmids, a practical approach", Ed. Hardy, IRL Press, p 147-175, 1993 |
| pSOK201 | pGM11 derivative in which the 1.2 kb EcoRI/HindIII fragment was replaced with the 3.0 kb EcoRI/HindIII fragment from pSOK101 containing ColE1, oriT and AmR, 7.1 kb | This work (see Figure 2) |
| DASHII | bacteriophage λ vector | Stratagene |
| pSOK804 | ColE1 replicon, AmR, oriT, int(VWB), attP(VWB) | This work (see Figure 2) and Van Mellaert et al., Microbiol., v. 144: 3351-3358, 1998 |
| pGEM7ermELi | pGEM7Zf plasmid containing PermE* promoter | C.R. Hutchinson |

Am-apramycin, Ap-ampicillin, Neo-neomycin, Thio-
thiostrepton, Km-kanamycin, Tc-tetracycline

Analysis of the secondary metabolite production 

Fermentations were performed in Applicon 3-l
fermenters containing initially 1.3 l SAO-23 medium
containing (g l^-1): glucose·H2O, 90; NH4NO3, 2.5; corn
flour, 3.0; MgSO4·7H2O, 0.4; KH2PO4, 0.2; CaCO3, 7; trace
element solution, 3 ml. Trace element solution (g l^-1):
FeSO4·7H2O, 5.0; CuSO4·5H2O, 0.39; ZnSO4·7H2O, 0.44;
MnSO4·H2O, 0.15; Na2MoO4·2H2O, 0.01; CoCl2·6H2O, 0.02;
HCl, 50. The fermentations were performed at 28°C with pH
controlled at 6.5-7.0 by HCl (2M) and NaOH (2M). The
dissolved oxygen was controlled at >40% of saturation by
the agitation (300-900 rpm) and aeration (0.25 vvm).
Inocula for the fermentations (3 vol-%) with SAO-23
medium were grown in TSB medium (TSB, Oxoid CM129, 37
g l^-1) at 28°C in shake flasks (500-ml baffled Erlenmeyer
flasks with 100 ml medium; 200 rpm). Each shake flask
was inoculated with 0.2 ml spore suspension and
incubated for 18-20 hours. Nystatin production was
assayed by HPLC of the dimethylformamide extracts of the
cultures after fermentations (Raatikainen, J.
Chromatogr., v. 588, 356-360, (1991)).



DNA Manipulation

Plasmid, phage and total DNA preparations,
endonuclease digestions, ligations and fractionation
were performed as described previously (Sambrook et al.,
1989, supra; Hopwood et al, 1985, supra). DNA fragments
were isolated from agarose gels using the QIAGEN Kit
(QIAGEN GmbH), labelled with the use of the digoxigenin
kit from Boehringer Mannheim and used for Southern blot
analysis according to the manufacturer's manual. DNA
sequencing was performed at QIAGEN GmbH (Germany), and
the data were analysed with the GCG software (Devereux
et al, Nucleic Acids Res. v. 12: 387-395, 1984).

PCR-assisted amplification and cloning of a PKS-encoding
DNA fragment from the S. noursei ATCC 11455 genome

In order to obtain the DNA encoding the nystatin
biosynthesis genes, the S. noursei ATCC 11455 gene library
was probed with labelled PKS-encoding DNA. To obtain
the DNA probe, degenerate oligonucleotide primers were
designed, which correspond to conserved amino acid
regions within β-ketoacyl synthase (KS) and acyl carrier
protein (ACP) domains of known modular PKSs. The
degenerate primers used for amplification corresponded
to the conserved amino acid motifs in ACP and KS domains
in known PKSs, and were designed according to the codon
usage table for Streptomyces (Wright & Bibb, Gene, v.
113: 55-65, 1992). The ACP nucleotide primer (sense)
corresponded to the motif Glu(Asp) Leu Gly Phe(Leu, Val)
Asp Ser Leu (SEQ ID NO: 21) and had the sequence 5'-
GAG/C CTG/C GGC/G T/CTG/C GAC TCC/G CTG/C-3' (SEQ ID NO:
22). The KS nucleotide primer (antisense) corresponded
to the motif Val Asp Thr Ala Cys Ser Ser (SEQ ID NO: 23)
and had the sequence 5'-G/CGA G/CGA G/ACA G/CGC C/GGT
GTC G/CAC-3' (SEQ ID NO: 24). Total DNA isolated from
S. noursei ATCC 11455 was used as a template for
polymerase chain reaction (PCR)-assisted amplification
of the DNA fragment from the genome of this organism
with the use of the KS and ACP oligonucleotide primers.
From the relative position of the motifs on the modular
PKSs it was assumed that the resulting PCR product would
be approx. 0.7 kb in size. The 50 µl PCR mixture contained
0.1 µg of S. noursei ATCC 11455 total DNA, 25 pmol of each
of the ACP and KS oligonucleotide primers, dNTPs (final
concentration 350 µM), 1x PCR buffer from the Expand High
Fidelity PCR System (Boehringer Mannheim), and 1.5 U of
the DNA polymerase mixture from the same system. The
PCR was performed on the Perkin Elmer GeneAmp PCR System
2400 with the following program: 1 cycle of denaturation
at 96°C (4 min), 35 cycles of denaturation/annealing/
synthesis at 94°C (45 sec) and 70°C (5 min) and 1 cycle
of final annealing/extension at 72°C (7 min). The 0.7
kb DNA fragment obtained with this procedure was cloned
in the pUC18 vector in E. coli DH5α with the use of the
Sure Clone Ligation Kit (Pharmacia). One of the resulting
recombinant plasmids, pPKS72 of 3.5 kb, was subjected to
DNA sequence analysis.
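
The slash notation used for the degenerate primers above can be read
mechanically. The following is a minimal sketch, assuming that "X/Y"
means "either X or Y at this single position" (our reading of the
notation, not something the text states explicitly). The function names
are hypothetical and chosen only for illustration. It counts how many
distinct oligonucleotides the ACP primer mixture would contain under
that reading.

```python
from itertools import product


def parse_degenerate(primer: str) -> list[str]:
    """Turn slash notation into a list of per-position alternatives.

    Assumption: a slash joins the base before it and the base after it
    as alternatives at one position, e.g. "GAG/C" -> ["G", "A", "GC"].
    """
    chars = primer.replace(" ", "")
    positions: list[str] = []
    i = 0
    while i < len(chars):
        if chars[i] == "/":
            # The base after the slash is an alternative for the
            # previous position.
            positions[-1] += chars[i + 1]
            i += 2
        else:
            positions.append(chars[i])
            i += 1
    return positions


def degeneracy(positions: list[str]) -> int:
    """Number of distinct sequences represented by the mixture."""
    count = 1
    for alternatives in positions:
        count *= len(alternatives)
    return count


def expand(positions: list[str]) -> list[str]:
    """Enumerate every concrete sequence in the mixture."""
    return ["".join(choice) for choice in product(*positions)]


# ACP (sense) primer as printed in the text (SEQ ID NO: 22).
ACP_PRIMER = "GAG/C CTG/C GGC/G T/CTG/C GAC TCC/G CTG/C"

acp_positions = parse_degenerate(ACP_PRIMER)
print(len(acp_positions), "positions,", degeneracy(acp_positions), "sequences")
```

Under this reading the primer spans seven codons, matching the
seven-residue motif it was designed against.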

Subsequent cloning in Escherichia coli vector pUC18
and DNA sequencing of the resulting 0.7 kb PCR product,
followed by conceptual translation and database search,
confirmed that it encodes part of PKS type I. This DNA
fragment was used as a probe for screening a S. noursei
ATCC 11455 gene library constructed in the phage vector
DASHII (see below) and one recombinant phage,
designated lambda DASHII-N1, which hybridized to the
probe, was isolated. As described further below,
DASHII-N1 was used to generate further probes.

Construction and screening of the S. noursei ATCC 11455
gene library

The S. noursei ATCC 11455 gene library was
constructed in phage lambda vector DASHII (Stratagene)
according to the manufacturer's instructions. Total DNA
from S. noursei ATCC 11455 was isolated as described in
Hopwood et al (1985) supra, partially digested with
Sau3AI restriction enzyme and fractionated on a sucrose
gradient as described in Sambrook et al (1989), supra.
The fractions of S. noursei ATCC 11455 DNA containing
fragments of 13-17 kb in size were used for ligation
with the DASHII vector arms digested with BamHI
restriction enzyme. E. coli XL1-Blue MRA (P2)
(Stratagene) was used as a host for gene library
construction and propagation.

DNA fragments to be used as probes for screening
the gene library were purified from agarose gels using
the QIAGEN Kit (QIAGEN GmbH, Germany), and labelled by
the use of the digoxigenin (DIG) kit from Boehringer
Mannheim (Germany), according to the manufacturer's
instructions. Probes used for the library screening and
relevant recombinant phages discovered:



| Probe | DASHII recombinant phages found using this probe |
|---|---|
| PKS72 | N1, N14 |
| E12.1: from N1 | N41, N42, N44, N45, N48 |
| E4.7.2: from N1 | N58 |
| L42E9.1: from N42 | N64, N76 |
| B1.058: from N58 | N69 |



Description of the probes: 

1. PKS72 probe. The 0.7 kb DNA fragment isolated from
the pPKS72 plasmid (see above) with restriction enzymes
EcoRI and HindIII was used as the PKS72 probe.

2. E12.1 probe. The 2.6 kb BamHI fragment from the
insert of recombinant phage N1, representing its left
flanking region, was subcloned into pGEM3Zf(-),
resulting in plasmid pGEM(B2.6)-1. This plasmid was
digested with EcoRI/AvaI, and the 0.55 kb fragment
corresponding to the left end of the N1 DNA insert was
purified and used as the E12.1 probe.

3. E4.7.2 probe. The 4.7 kb EcoRI fragment from the
recombinant phage N1, representing its right flanking
region, was subcloned into pGEM3Zf(-), resulting in
plasmid pGEM(E4.7)-1. This plasmid was digested with
EcoRI/HindIII, and the 1.5 kb DNA fragment corresponding
to the right end of the N1 DNA insert was purified and
named E4.7.2.

4. L42E9.1 probe. The 9.0 kb EcoRI fragment
representing the left flanking region of the DNA insert
in the recombinant phage N42 was subcloned into
pGEM3Zf(-), resulting in plasmid pH42E9.1. This plasmid
was digested with EcoRI/BamHI, and the 0.6 kb fragment
corresponding to the left flank of the N42 DNA insert
was purified and used as the L42E9.1 probe.

5. L58B1.0 probe. The 3.0 kb EcoRI fragment
representing the right flanking region of the
recombinant phage N58 DNA insert was subcloned into
pGEM3Zf(-), resulting in plasmid pGEM(E3.0)-58. From
the latter plasmid, the 1.0 kb BamHI fragment which is
located on the right end of the N58 DNA insert was
purified and used as the L58B1.0 probe.
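
The probe descriptions above amount to a stepwise "walk" through the
library: each new probe is cut from the end of a phage insert found in
an earlier round. As an illustration only, the sketch below records the
probe-to-phage relationships listed in the text in a small dictionary
and traverses them breadth-first. The data structure and the function
name `walk_library` are our own illustrative choices, and the probe
names follow the numbered descriptions.

```python
from collections import deque

# probe name -> (phage the probe was derived from, phages it detected).
# None marks the initial PCR-derived probe, which has no parent phage.
PROBES = {
    "PKS72": (None, ["N1", "N14"]),
    "E12.1": ("N1", ["N41", "N42", "N44", "N45", "N48"]),
    "E4.7.2": ("N1", ["N58"]),
    "L42E9.1": ("N42", ["N64", "N76"]),
    "L58B1.0": ("N58", ["N69"]),
}


def walk_library(probes: dict) -> list[str]:
    """Return phages in the order a breadth-first walk would find them."""
    found: list[str] = []
    seen: set[str] = set()
    # Start from probes that were not derived from any phage.
    queue = deque(name for name, (src, _) in probes.items() if src is None)
    while queue:
        probe = queue.popleft()
        for phage in probes[probe][1]:
            if phage in seen:
                continue
            seen.add(phage)
            found.append(phage)
            # Queue every probe that was cut from this newly found phage.
            queue.extend(n for n, (src, _) in probes.items() if src == phage)
    return found


phages = walk_library(PROBES)
print(phages)
```

The traversal reaches eleven phages in total: N1 plus the ten further
phages named in the screening table.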

Gene disruption experiment with the nystatin 
biosynthesis gene cluster 

A 4.2 kb BamHI DNA fragment isolated from the
DASHII-N1 recombinant phage was first cloned into the
pGEM3Zf(-) vector in E. coli, resulting in the plasmid
pGEM4.2-1. DNA sequences on both ends of the cloned
fragment were determined and, after database search,
were found to encode PKS type I. The S. noursei DNA
fragment cloned in pGEM4.2-1 was excised from this
plasmid with restriction enzymes EcoRI and HindIII,
ligated with the 3.0 kb EcoRI/HindIII fragment from the
vector pSOK201 (see Table 3 and Figure 3), and
transformed into E. coli. Plasmid DNA designated
pKO(4.2)-1 was recovered from transformants, and was
then transferred into S. noursei ATCC 11455 by
conjugation as described in Example 1 under "Bacterial
strains, plasmids and growth conditions".

Since pKO(4.2)-1 is not capable of replicating in
S. noursei ATCC 11455, it was assumed that it would
function as a suicide vector integrating into the genome
of S. noursei via homologous recombination. As a result
of such recombination, a PKS gene in the S. noursei
ATCC 11455 genome, for which the 4.2 kb BamHI fragment
cloned in pKO(4.2)-1 was presumed to be internal, would
have been inactivated by disruption of its coding region.
Integration of pKO(4.2)-1 into the genome of three
S. noursei transconjugants was confirmed by Southern blot
analysis with the use of the labelled 4.2 kb BamHI
fragment from pGEM4.2-1 as a probe. One of the S. noursei
disruption mutants carrying pKO(4.2)-1 integrated into
its genome was tested for nystatin production in
parallel with the parental strain ATCC 11455 (see above
for methods under "Analysis of the secondary metabolite
production"). While the latter was shown to produce
nystatin at the expected level, no nystatin production
was detected in the pKO(4.2)-1 disruption mutant, thus
confirming the requirement of the identified PKS for
nystatin biosynthesis.

Cloning of the nystatin biosynthesis gene cluster 

In order to clone the entire gene cluster for
nystatin biosynthesis, the DNA fragments derived from
the ends of the S. noursei DNA insert in the recombinant
phage DASHII-N1 (N1), and subsequently found overlapping
recombinant phages, were used as probes for screening
the gene library (see above for probes). This screen
resulted in isolation of the recombinant phages N14,
N41, N42, N44, N45, N48, N58, N64, N69, and N76
comprising two regions (SEQ ID NOs 1 and 2 respectively)
of the S. noursei ATCC 11455 genome (approx. 98 kb
total), as depicted in Fig. 4. A gene disruption
experiment with the 4.3 kb EcoRI/BamHI DNA fragment
derived from the recombinant phage N64 (performed
essentially the same way as described above) confirmed
that the second region (SEQ ID NO. 2) also encodes
nystatin biosynthesis genes.

DNA sequence analysis of the nystatin biosynthesis gene
cluster

The complete DNA inserts from the recombinant phages
mentioned above were subcloned either as XbaI or EcoRI
fragments into the pGEM3Zf(-) vector in an E. coli host,
and nucleotide sequences were determined on both DNA
strands of these fragments. Computer-assisted analysis of
the DNA sequences comprising the two regions of the
nystatin biosynthesis gene cluster (SEQ ID NOs 1 and 2
respectively) resulted in identification of the genes
shown in Fig. 4 and listed in Table 4.

Table 4. Genes identified within the nystatin
biosynthesis gene cluster of S. noursei

| Designation | Product | Putative function |
|---|---|---|
| nysR1 | transcriptional activator | regulation of nystatin production |
| nysR2 | transcriptional activator | regulation of nystatin production |
| nysR3 | transcriptional activator | regulation of nystatin production |
| nysR4 | transcriptional activator | regulation |
| nysR5 | transcriptional repressor | regulation |
| ORF1 | peptidase | peptide metabolism |
| ORF2 | transcriptional activator | regulation |
| nysA | PKS type I | nystatin polyketide backbone synthesis (loading module) |
| nysB | PKS type I | nystatin polyketide backbone synthesis (modules 1&2) |
| nysC | PKS type I | nystatin polyketide backbone synthesis (modules 3-8) |
| nysI (incomplete) | PKS type I | nystatin polyketide backbone synthesis (modules 9-13) |
| nysE | thioesterase | release of final product from PKS |
| nysD1 | glycosyltransferase | attachment of mycosamine to the polyketide backbone |
| nysD2 | aminotransferase | mycosamine biosynthesis |
| nysD3 | GDP-mannose-4,6-dehydratase | mycosamine biosynthesis |
| nysH | ABC transporter | efflux of nystatin |
| nysG | ABC transporter | efflux of nystatin |
| nysF | 4'-phosphopantetheine transferase | post-translational modification of PKS |

Three complete genes (nysA, nysB and nysC, in SEQ ID
NO: 1) and one incomplete gene (nysI, in SEQ ID NO: 2)
encoding type I PKSs were identified. The amino
acid (aa) sequences of the products encoded by these
four genes were analysed by comparison to the aa
sequences of known PKSs type I (see also Table 2 above
for molecule features). Since all four proteins
displayed a high degree of homology towards rifamycin and
rapamycin PKSs (Aparicio et al., Gene, v. 169: 9-16,
1996; Tang et al., Gene, v. 216: 255-265, 1998),
presumptive functional analysis of the nystatin PKSs was
based on comparison to the former. The NysA protein of
1366 aa comprises one PKS module composed of KS^S, AT,
DH, KR, and ACP domains. The lack of a conserved
cysteine residue in the KS^S domain suggests that this
module cannot perform a condensation reaction, and thus
most probably represents a loading module providing the
acetate starter unit for initiation of nystatin
polyketide backbone biosynthesis (Bisang et al., Nature,
v. 401: 502-505, 1999). Analysis of the 3192 aa
sequence of NysB revealed that it contains two modules
with KS, AT, KR, ACP, and apparently inactive DH
domains. The AT domains identified in both NysB modules
display features characteristic of the
methylmalonyl-CoA-specific AT domains (Haydock et al.,
FEBS Lett., v. 374: 246-248, 1995). This feature of the
NysB protein suggests that it comprises the 1st and 2nd
modules involved in the nystatin polyketide backbone
biosynthetic pathway, as the only two proximal
methylmalonyl-CoAs incorporated in the nystatin molecule
are the first two extender units. The NysC protein of
11096 aa, the largest, to our knowledge, bacterial
polypeptide discovered so far, is composed of 6 modules
apparently responsible for the condensation steps 3 to 8
in nystatin polyketide chain biosynthesis (incorporation
of C21 - C32). Module 5 of the NysC protein contains an ER
domain, which is accountable for the reduction of the
double bond between C29 and C28 (see Fig. 1). Apart from
module 5, all other modules in NysC are similar in that
they all contain KS, AT, DH, KR and ACP domains. It was
noticed that the KR domains in modules 4 and 5 are 100%
identical at the aa sequence level, and 99.9% identical
at the level of the DNA sequences encoding these domains.
Thus, the KR domains in modules 4 and 5 most probably
represent an example of a relatively recent duplication
in the process of evolution. The C-terminally
truncated NysI protein of 7066 aa, for which aa sequence
information at the C-terminus is still missing, is
composed of at least 4 complete modules. All these
modules contain KS, AT, DH, KR and ACP domains, but the
DH domains in three modules are apparently inactive,
suggesting that the known part of the NysI PKS is
responsible for the elongation steps 9 through 12
(incorporation of C13 - C20). This assumption is
further supported by the fact that the AT domain in
module 11 (based on sequence similarity) is specific for
the methylmalonyl-CoA extender, while all other AT
domains in NysA, NysC and NysI are malonyl-CoA-specific.
Thus, the AT domain in module 11 is presumably
responsible for the occurrence of the methyl group at C16
on the nystatin polyketide backbone, which is later
oxidized to give a C16-coupled carboxyl (see Fig. 1).
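
The carbon ranges quoted above (modules 3 to 8 incorporating C21 - C32,
modules 9 to 12 incorporating C13 - C20) are consistent with each
extension module adding two backbone carbons, with numbering running
downwards as the chain grows. The sketch below encodes that reading as a
simple linear rule. It is an assumption inferred from the stated ranges,
not a mapping given explicitly in the text, and the function name is
hypothetical. It is restricted to modules 3 to 12, the only ones for
which the text states carbon ranges.

```python
def carbons_for_module(module: int) -> tuple[int, int]:
    """Backbone carbons assumed to be incorporated by one extension module.

    Assumption: two carbons per module, anchored on module 3 adding
    C31 and C32 (the top of the C21 - C32 range quoted for NysC).
    """
    if not 3 <= module <= 12:
        raise ValueError("carbon ranges are only stated for modules 3-12")
    high = 32 - 2 * (module - 3)
    return (high - 1, high)


for m in range(3, 13):
    low, high = carbons_for_module(m)
    print(f"module {m}: C{low}-C{high}")
```

Under this rule module 11 covers C15 and C16, in line with the methyl
group at C16 being attributed to the module 11 AT domain, and module 5
covers C27 and C28, adjacent to the C28-C29 bond reduced by its ER
domain.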

Downstream of the nysC gene, a coding sequence nysE
for thioesterase, presumably responsible for the release
of the mature polyketide chain from the nystatin PKSs,
was found. The nysR1, nysR2 and nysR3 genes were found
downstream of nysE. The products of these genes are
homologous to the presumed transcriptional regulators.
NysR1 (966 aa), NysR2 (953 aa), and NysR3 (927 aa)
proteins were all found to contain putative
helix-turn-helix (HTH) DNA binding motifs of LuxR type
at their C-termini. Beside that, NysR1 contained a
distinct ATP/GTP binding motif, and NysR3 contained a
"leucine zipper" (putative DNA binding) motif at their
N-termini. The gene encoding NysR1 was found to contain
a rare TTA codon (for Leu11) close to the beginning of
the gene, suggesting that NysR1 expression might be
regulated in S. noursei at the level of translation by a
bldA-like gene (White & Bibb, J. Bacteriol., v. 179: 627-
633, 1997).

In order to confirm the involvement of the nysR1 gene
in nystatin biosynthesis, a gene disruption
experiment was performed with the use of a 1379 bp ApaI
DNA fragment internal for the nysR1 coding sequence.
The 1379 bp ApaI DNA fragment internal for the nysR1
coding sequence and representing nt 51531-52910 of SEQ
ID NO 1 was cloned into the ApaI site of the pGEM11Zf(-)
vector, giving the recombinant plasmid pNRD1. The 1430
bp EcoRI/HindIII DNA fragment isolated from pNRD1 was
ligated with the 3.0 kb EcoRI/HindIII fragment of
pSOK201, resulting in the pNRD2 vector. The latter was
subsequently used for nysR1 gene disruption in S.
noursei ATCC 11455 via conjugation from E. coli ET12567
(pUZ8002).

Analysis of nystatin production by the nysR1
disruption mutant revealed that it is not capable of
producing nystatin. It should be noted, however, that
the phenotype observed for the nysR1 disruption mutant
might reflect a polar effect of the mutation on the nysR2
and nysR3 genes, which can be co-transcribed with nysR1.

Downstream of nysR3, the genes nysR4, nysR5, ORF1,
and ORF2 were found. The nysR4 gene product of 210 aa
(this was subsequently found to be 266 aa, expressed
from an upstream start codon at nucleotide 120628 in SEQ
ID No. 35) shows similarity to transcriptional
activators of response regulator type, and contains
centrally located ATP/GTP binding and C-terminally
located LuxR-type HTH DNA binding motifs. The NysR5
protein of 253 aa displays similarity to transcriptional
repressors, and contains a putative DeoR-type HTH DNA
binding motif at its N-terminus. It seems likely that
the nysR4 and nysR5 gene products are involved in
regulation of nystatin biosynthesis, based on their
location proximal to the nystatin biosynthesis genes.
ORF2, located downstream of nysR5 and transcribed in the
opposite direction, encodes a 354 aa peptide showing
similarity to transcriptional activators and having a
centrally located putative HTH DNA binding motif of AsnC
type. Whether this gene is involved in regulation of
nystatin biosynthesis is not apparent, as ORF1, located
immediately upstream of ORF2, encodes a putative
peptidase. It seems likely that ORF2 is rather involved
in regulation of ORF1, but to confirm this,
experimentation on ORF2 inactivation is required. The
fact that the gene encoding a peptidase, for which no
role in nystatin biosynthesis could be assigned, was
found on the right flank of the sequenced region
suggests that the right border of the nystatin
biosynthesis gene cluster had been identified.

On the left flank of the sequenced region
encompassing the genes described above, two genes
located upstream of nysA, and transcribed in the
direction opposite to nysA, were identified (Fig. 4).
The nysD1 gene product of 506 aa displays considerable
homology to the UDP-glucuronosyltransferases from
mammals. This enzyme belongs to the
UDP-glycosyltransferase family, and takes part in the
process of elimination of potentially toxic xenobiotics
by way of their glycosylation. It seems likely that
NysD1 represents a glycosyltransferase responsible for
the attachment of the deoxysugar moiety (mycosamine) to
the nystatin polyketide backbone. The product of nysD2
shows a high degree of homology to the perosamine
synthetases and transaminases responsible for the
attachment of amino groups to the deoxysugars in the
process of their biosynthesis. Thus, NysD2 presumably
represents an aminotransferase involved in mycosamine
biosynthesis.

Beside nysI encoding PKS type I, four other genes
were identified within the second sequenced region (SEQ
ID NO. 2) of the nystatin biosynthesis gene cluster of
S. noursei ATCC 11455 (Fig. 4). The 344 aa protein
encoded by the nysD3 gene is highly homologous to the
GDP-mannose-4,6-dehydratases required for deoxysugar
formation. It is thus likely that the NysD3 protein is
involved in biosynthesis of the nystatin mycosamine
moiety in S. noursei ATCC 11455. The nysH and nysG gene
products of 584 aa and 605 aa, respectively, display a
high degree of similarity to transporters of the ABC
family. Both NysH and NysG polypeptides contain a
distinct ABC transporter signature at their C-termini,
centrally located ATP/GTP binding motifs, and
N-terminally located transmembrane regions. These two
proteins most probably are responsible for the
ATP-dependent active efflux of nystatin from the
producing organism, thus eliminating the danger of
membrane clogging by the hydrophobic nystatin molecules.
The product of the nysF gene, a 245 aa polypeptide,
displays homology to the 4'-phosphopantetheine
transferases. The latter enzyme is responsible for the
post-translational modification of the ACP domains of
the PKSs, and is required for their full functionality
(Cox et al., FEBS Lett., v. 405: 267-272, 1997; Kealey
et al., Proc. Natl. Acad. Sci. USA, v. 95: 505-509,
1998). It seems likely, therefore, that the NysF
protein functions in post-translational modification of
the nystatin PKSs, and is required for nystatin
biosynthesis.

Example 2 - Genetic manipulation of the nystatin PKS
genes leading to production of novel polyene antibiotics

Nystatin belongs to the group of the polyene
macrolide compounds, which are characterized by having 3
to 8 conjugated double bonds in their macrolactone ring.
Nystatin itself is a tetraene, having 4 conjugated
double bonds between C20 and C27. There is also a set
of 2 conjugated double bonds on the nystatin molecule,
between C30 and C33, which is separated from the set of
4 conjugated double bonds by C28-C29 (see Fig. 1). From
the computer-assisted analysis of the NysC PKS it became
apparent that the ER domain in module 5 in this protein
is responsible for reduction of the double bond between
C28 and C29. Thus, by inactivating this particular
domain, it is theoretically possible to obtain a
compound with a double bond between C28 and C29, thus
joining the two sets of conjugated double bonds in the
nystatin molecule, and creating a heptaene macrolide
compound.
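
The tetraene-to-heptaene argument above is a small counting exercise,
sketched below. Double bonds are represented by the lower-numbered
carbon of each C=C pair, and the individual bond positions are inferred
from the ranges given in the text (four bonds spanning C20 to C27, two
spanning C30 to C33). That placement, and the function name, are
illustrative assumptions.

```python
def longest_conjugated_run(double_bonds: set[int]) -> int:
    """Length of the longest run of conjugated double bonds.

    Each entry n stands for a C(n)=C(n+1) bond. Two double bonds are
    conjugated when exactly one single bond separates them, i.e. when
    their starting carbons differ by 2.
    """
    best = run = 0
    previous = None
    for start in sorted(double_bonds):
        if previous is not None and start - previous == 2:
            run += 1
        else:
            run = 1
        best = max(best, run)
        previous = start
    return best


# Assumed positions: C20=C21, C22=C23, C24=C25, C26=C27, C30=C31, C32=C33.
NYSTATIN_DOUBLE_BONDS = {20, 22, 24, 26, 30, 32}

# Leaving the C28=C29 bond unreduced (ER domain of module 5 inactive).
ER5_INACTIVE = NYSTATIN_DOUBLE_BONDS | {28}

print(longest_conjugated_run(NYSTATIN_DOUBLE_BONDS))
print(longest_conjugated_run(ER5_INACTIVE))
```

With the C28=C29 bond retained, the two separate runs merge into a
single run of seven, which is the heptaene the text predicts.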

To inactivate the ER domain in module 5 of NysC,
the method of "in-frame" deletion within the nysC gene
was chosen. The construction of the vector pERD4.2 for
gene replacement in the NysC-encoding genomic region of
S. noursei was as follows:

Inactivation of ER domain in module 5 of NysC 

PCR-assisted amplification of the 394 bp DNA
fragment representing the coding sequence for the
C-terminal part of the ER domain in module 5 of NysC, and
the coding sequence for the N-terminal part of the KR
domain in module 5 (nt 32174 - 32559), was performed.
The oligonucleotide primer ERD1
(5'-GTTGGTACCCCACTCCCGGTCCGCAC-3', sense) (SEQ ID NO. 25)
was selected from the nucleotide sequence of the nysC gene
comprising nt 32174-32190 (SEQ ID NO. 1) with additional
nucleotides on the 5' end in order to create a KpnI
restriction enzyme cleavage site. The oligonucleotide
primer ERD2 (5'-CCAGCCGCATGCACCACC-3', antisense) (SEQ ID
NO. 26) was selected from the nysC coding DNA sequence,
and comprised the DNA segment between nt 32559-32542
(SEQ ID NO. 1) containing a SphI restriction enzyme
cleavage site. The resulting PCR fragment was digested
with KpnI and SphI, and ligated together with the 1828
bp BamHI/KpnI DNA fragment (nt 29224-31052 of SEQ ID NO.
1), and the 1273 bp SphI/EcoRI DNA fragment (nt 32548-
33821 of SEQ ID NO. 1) into the EcoRI/BamHI-digested
pGEM3Zf(-) vector. The ligation mixture was transformed
into E. coli DH5α, and a recombinant plasmid of 6.7 kb
designated pERD4.1 was recovered from one of the
transformants. The latter contained a hybrid DNA
fragment representing nt 29224-33821 of the SEQ ID NO. 1
DNA sequence with an internal deletion between nt 31052
and nt 32174 of the nysC coding region. This deletion
eliminated the coding regions (aa 4837 to 5208 of NysC)
for the part of the DH-ER interdomain linker and
C-terminal part of the ER domain containing a putative
NADP(H) binding site.
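
The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the described procedure: it assumes the standard recognition sequences for KpnI (GGTACC) and SphI (GCATGC), takes the primer sequences and coordinates exactly as printed, and uses helper names (codons_in_span, is_in_frame) that are hypothetical.

```python
# Restriction enzyme recognition sequences (standard sites, assumed here).
KPNI_SITE = "GGTACC"
SPHI_SITE = "GCATGC"

# Primer sequences as given in the text (5' to 3').
ERD1 = "GTTGGTACCCCACTCCCGGTCCGCAC"
ERD2 = "CCAGCCGCATGCACCACC"


def codons_in_span(first_aa: int, last_aa: int) -> int:
    """Number of codons in an inclusive amino acid range."""
    return last_aa - first_aa + 1


def is_in_frame(n_bp: int) -> bool:
    """A deletion keeps the reading frame if its length is a multiple of 3."""
    return n_bp % 3 == 0


# aa 4837 to 5208 of NysC are stated to be eliminated.
deleted_codons = codons_in_span(4837, 5208)
deleted_bp = deleted_codons * 3

# Distance between the two junction coordinates quoted for the deletion.
junction_gap_bp = 32174 - 31052

print(deleted_codons, deleted_bp, is_in_frame(deleted_bp))
print(junction_gap_bp, is_in_frame(junction_gap_bp))
print(KPNI_SITE in ERD1, SPHI_SITE in ERD2)
```

Both the codon-based length and the coordinate gap are multiples of three, which is what an "in-frame" deletion requires. The two values differ by 6 bp; one possible reading, not stated in the text, is that this corresponds to the KpnI site contributed by primer ERD1.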

To construct the vector for inactivation of NysC
ER4 domain in S. noursei, the recombinant DNA fragment
was excised with EcoRI and HindIII restriction enzymes
from pERD4.1 and ligated with the 3.0 kb EcoRI/HindIII
DNA fragment from pSOK201, and the ligation mixture was
transformed into E. coli DH5α. Plasmid pERD4.2 of
6.5 kb was isolated from one of the transformants and
used to perform the gene replacement procedure in
S. noursei ATCC11455 (see below). The recombinant
S. noursei strains selected after this procedure were
designated ERD44 and ERD48.
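
As a rough plausibility check on the stated plasmid size, the S. noursei DNA retained in the construct can be summed from the coordinates given for pERD4.1 and added to the 3.0 kb pSOK201 fragment. This is a back-of-the-envelope sketch: the function name retained_bp is hypothetical, the vector fragment is taken as exactly 3000 bp, and the few polylinker bases carried over are ignored.

```python
def retained_bp(start: int, del_left: int, del_right: int, end: int) -> int:
    """Length kept from start..end after removing del_left..del_right."""
    return (del_left - start) + (end - del_right)


# Hybrid fragment nt 29224-33821 with the deletion between nt 31052 and 32174.
insert_bp = retained_bp(29224, 31052, 32174, 33821)

# The "3.0 kb" EcoRI/HindIII fragment of pSOK201, rounded (assumption).
vector_bp = 3000

plasmid_kb = round((insert_bp + vector_bp) / 1000, 1)
print(insert_bp, plasmid_kb)
```

Under these assumptions the estimate agrees with the 6.5 kb reported for pERD4.2.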

Plasmid pERD4.2 was introduced into S. noursei
ATCC11455 by intergeneric conjugation, and one
transconjugant, S. noursei (pERD4.2), was chosen for
further manipulations. After the correct mode of
integration of pERD4.2 into the genome of S. noursei
(pERD4.2) was confirmed by Southern blot analysis,
selection for the second crossover event was carried out
as described in Sekurova et al., FEMS Microbiol Lett,
v. 177: 297-304, 1999 (and see below).

Gene replacement procedure 

This method is carried out as described by Sekurova
et al., 1999, supra.

The plasmid constructed for gene replacement as
described above was introduced into S. noursei by
conjugation from E. coli ET12567 (pUZ8002). One of
the clones carrying the plasmid integrated into the
chromosome via homologous recombination was subjected to
three rounds of sporulation on antibiotic-free ISP2 agar
medium, and the progeny after the third round was tested
for the loss of the antibiotic resistance marker. Southern
blot analysis of the total DNA isolated from several
antibiotic-sensitive strains with an appropriate probe
was used to confirm the desired mutation.

Of 8 colonies which had lost the selection marker,
and thus undergone a second crossover event, 4 were
shown by Southern blot analysis to have reverted to the
wild-type genotype. Two strains were shown to contain a
large deletion apparently eliminating a substantial
portion of the nystatin gene cluster, while two other
mutants contained either the desired 1116 bp deletion
(ERD44), or a deletion which was somewhat larger than
expected (ERD48). Analysis of the polyene antibiotic
production by the strains ERD44 and ERD48 revealed that
they do not produce nystatin. Instead, ERD44 was shown
to produce a polyene compound having UV spectrum peaks
characteristic for heptaenes (Figure 6), as expected.
Surprisingly, the ERD48 mutant was shown to produce
hexaenic polyenes (according to spectroscopic analysis),
which would be consistent with "in-frame" deletion of
the complete module 5 from the NysC protein. In order to
investigate the event which occurred in the ERD48 mutant
in more detail, a DNA fragment which would encompass the
putative product of such a deletion was PCR-amplified
from the genome of the S. noursei ERD48 mutant (see
below).

PCR-assisted amplification and cloning of the PKS-encoding
DNA fragment from the S. noursei ERD48 genome

The PCR reaction aimed at amplification of part of
the mutant nysC gene in S. noursei ERD48 was carried out
with oligonucleotide primers KR48.1 (sense, 5'-CCG CGT
CGG ATC CGC CGA C-3') (SEQ ID NO: 27) and KR48.2
(antisense, 5'-AGC CTT CGA ATT CGG CGC C-3') (SEQ ID NO:
28) which corresponded to nt 24744-24760 and
nt 33818-33833, respectively, of the DNA sequence in SEQ
ID NO. 1. The 50 µl PCR mixture contained 0.1 µg of
S. noursei ERD48 total DNA, 25 pm of each of the KR48.1
and KR48.2 oligonucleotide primers, dNTPs (final
concentration 350 µM), 1x PCR buffer from the Expand High
Fidelity PCR System (Boehringer Mannheim) and 1.5 U of
the DNA polymerase mixture from the same system. The
PCR was performed on the Perkin Elmer GeneAmp PCR System
2400 with the following program: 1 cycle of denaturation
at 96°C (4 min), 35 cycles of denaturation/annealing/
synthesis at 94°C (45 sec) and 70°C (10 min) and 1 full
cycle of final annealing/extension at 72°C (10 min).
The 2.7 kb DNA fragment obtained with this procedure was
digested with EcoRI and BamHI restriction enzymes, and
ligated with pGEM3Zf(-) vector DNA digested with the
same enzymes. The ligation mixture was introduced into
E. coli DH5α by transformation, and the plasmid pKR48 of
5.9 kb was isolated from one of the transformants and
subjected to DNA sequence analysis.
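
The size of the PCR product is itself informative. A minimal sketch, assuming only the primer coordinates quoted above and treating the observed product as 2700 bp, compares the amplicon expected from the wild-type sequence with the one obtained from ERD48. The function name amplicon_length is hypothetical, and the result is approximate because the primers carry mismatches that create restriction sites.

```python
def amplicon_length(sense_start: int, antisense_end: int) -> int:
    """Inclusive length of the region spanned by a primer pair."""
    return antisense_end - sense_start + 1


# KR48.1 starts at nt 24744; KR48.2 ends at nt 33833 (SEQ ID NO. 1).
wild_type_bp = amplicon_length(24744, 33833)

# Size of the fragment actually obtained from ERD48 (about 2.7 kb).
observed_bp = 2700

# Sequence apparently missing from the ERD48 genome between the primers.
missing_bp = wild_type_bp - observed_bp

print(wild_type_bp, observed_bp, missing_bp)
```

On these numbers several kilobases are absent between the primer sites in ERD48, far more than the intended 1116 bp deletion, which fits the description of the ERD48 deletion as larger than expected.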

The DNA sequence of the insert in pKR48 is presented
in SEQ ID NO. 29 (identified herein as ERD48.seq). The
translation product is shown in SEQ ID NO. 30 and is an
899 aa protein. The molecule features of SEQ ID NOs 29
and 30 respectively are shown below:

SEQ ID NO: 29 (DNA: ERD48.seq)

Start  End   Name   Description

1      254   DH4    DH4 domain coding region, C-terminal
1170   1913  KR4/5  hybrid ketoreductase domain, module 4/5
2010   2231  ACP5   ACP5 domain coding region
2295   2700  KS5    KS5 domain coding region

SEQ ID NO: 30 (AA: translation product)

Start  End   Name   Description

1      84    DH4    DH4 domain module 4, C terminus
390    637   KR4/5  hybrid KR domain
670    743   ACP5   ACP domain, module 5
764    899   KS5    KS domain, module 5, N terminus


The cloning and DNA sequencing of this 2700 bp
fragment confirmed that it encodes a part of a hybrid
PKS module, which would be consistent with recombination
between DNA sequences encoding highly homologous KR
domains in modules 4 and 5 of NysC. This recombination
apparently has led to deletion of DNA sequences encoding
C-terminal parts of KR and ACP domains of module 4, and
KS, AT, DH, ER domains, and N-terminal part of the KR
domain of module 5, thus resulting in the loss of one
complete module from NysC PKS.

Preliminary analysis of the compounds produced by
S. noursei ERD44 and ERD48 was carried out. It was
shown that a heptaenic compound produced by the ERD44
mutant has high antifungal activity against Candida
albicans, an organism used in tests for antifungal
activity. At this point it is not possible to
accurately assay the activity of this compound
(tentatively named S44), because it is not yet properly
purified, and its exact concentration is difficult to
estimate. However, some rough estimates based on the UV
absorbance at the wavelengths characteristic for
nystatin, S44, and amphotericin (a heptaenic macrolide)
suggest that S44 might have 4-5 times higher antifungal
activity compared to nystatin. HPLC analysis of the
compounds produced by the ERD48 mutant suggests that at
least 5 hexaenic macrolides with different retention
times are produced by this strain (mixture called S48).
This probably reflects the different states of
modification of the macrolactone ring by, e.g.,
glycosylation at C19, hydroxylation at C10, or oxidation
of the methyl group at C16. This could have been
expected, since reduction of the macrolactone ring size
most probably leads to lower affinity of the
modifying enzymes towards the new substrate.

Antifungal activity of the S48 mixture was tested,
and found to constitute approx. 10% of nystatin
activity. It seems probable that only one of the
compounds in the S48 mixture produced by ERD48, which is
fully decorated by the ring-modifying enzymes, is
responsible for the antifungal activity detected. Thus,
relative antifungal activity of this compound is
impossible to assess prior to its purification. Some of
the hexaene antibiotics are known to have antibacterial,
as well as antifungal activity (Ciftci, et al., J.
Antibiot, v. 37: 876-884, 1984). It is thus possible
that hexaenic compounds produced by the ERD48 mutant can
be used for production of antibacterial agents. The
changes in the NysC protein leading to production of new
polyene compounds in ERD44 and ERD48 mutants, along with
the predicted structures of their macrolactone rings, are
presented in Fig. 5.

Inactivation of DH8 domain in module 8 of NysC 

The possibility of genetically manipulating the
nystatin PKS was further exemplified by inactivation of
the DH domain in module 8 of NysC. The plasmid pNPR1.1
for gene replacement within the nysC gene, which would
result in in-frame deletion of the DNA region encoding
the DH8 domain, was constructed as below:

The 3989 bp KpnI/BclI DNA fragment (nt 43004-46993
of the region 1 DNA sequence (SEQ ID NO. 1)) and the 2409
bp BamHI/EcoRI DNA fragment (nt 47680-50089 of the same)
were excised from the DNA of recombinant phage N1 and
ligated with vector pGEM3Zf(-) DNA digested with EcoRI
and KpnI. The ligation mixture was transformed into
E. coli DH5α, and recombinant plasmid pGEM-NPR1 was
isolated from one of the transformants. The latter
contained the hybrid DNA fragment representing nt
43004-50089 of the region 1 DNA sequence (SEQ ID NO: 1)
with the internal deletion between nt 46993 and nt
47680. This deletion eliminated the DNA region encoding
aa 10150 to 10378 of the NysC polypeptide, thus
affecting the DH8 domain in module 8, and the DH8-KR
interdomain linker in module 8. To construct the vector
for inactivation of the NysC DH8 domain in S. noursei,
the recombinant DNA fragment was excised with EcoRI and
HindIII restriction enzymes from pGEM-NPR1 and ligated
with the 3.0 kb EcoRI/HindIII DNA fragment from pSOK201
(see Fig. 3 and Table 3), and the ligation mixture was
transformed into E. coli DH5α. Plasmid pNPR1.1 of
9.4 kb was isolated from one of the transformants and
used to perform the gene replacement procedure in
S. noursei ATCC11455 according to Sekurova et al., 1999,
supra.

The recombinant S. noursei strain selected after
this procedure was designated NPR1.1. The S. noursei
NPR1.1 recombinant strain was shown by Southern blot
analysis to contain the desired deletion in the
DH8-coding sequence of nysC. Analysis of the secondary
metabolites in the culture extracts of the S. noursei
NPR1 recombinant strain by thin layer chromatography
(TLC) revealed the presence of presumed macrolide
compounds. The relative mobility of these compounds
differed from nystatin, and no UV spectra characteristic
for nystatin could be detected in the extracts. It was
suggested that in the new molecule(s) produced by the
NPR1 recombinant a set of 4 double bonds on the nystatin
aglycone has been disturbed, and that the macrolactone
ring now contains a hydroxy group attached at C23 (Table
5). No attempts to purify the compound(s) from NPR1
were made, as the bioassay against Candida albicans made
with the NPR1 culture extracts showed very low
antifungal activity. However, the NPR1 mutant can be
potentially useful for further manipulations with the
nystatin PKS.

Inactivation of KR domain in module 7 (NysC)

The 4404 bp DNA fragment was excised with EcoRI and
SmaI restriction enzymes from the DNA of recombinant
phage N1. The EcoRI site is situated in the polylinker
of phage N1 to the left of the S. noursei DNA insert
starting at nt 38398 of the region 1 DNA sequence (SEQ
ID NO. 1), while the SmaI site corresponds to nt 42802
of the region 1 DNA sequence. The 3303 bp DNA fragment
was excised with SmaI and BamHI restriction enzymes from
the DNA of the same recombinant phage. The SmaI and
BamHI restriction sites are situated at nt 43099 and nt
46402 respectively, of the region 1 DNA sequence (SEQ ID
NO. 1). Both DNA fragments were ligated with the
pGEM3Zf(-) DNA digested with restriction enzymes EcoRI
and BamHI, and the ligation mixture was transformed into
Escherichia coli DH5α. The plasmid pGEM-NPR2 of 10.7 kb
was recovered from one of the transformants, which
contained a recombinant DNA fragment which represented
nt 38398-46402 of the region 1 DNA sequence (SEQ ID NO.
1) with the internal deletion between nt 42802 and nt
43099. This deletion results in elimination of the DNA
region encoding aa 8753 to 8851 of the NysC protein
encompassing a putative NADP(H) binding site in the KR
domain of module 7. To construct the vector for
inactivation of the NysC KR7 domain in S. noursei, the
recombinant DNA fragment was excised from pGEM-NPR2 with
EcoRI and HindIII restriction enzymes, and ligated with
the 3.0 kb EcoRI/HindIII DNA fragment from pSOK201, and
the ligation mixture was transformed into E. coli
DH5α. Plasmid pNPR2 of 10.7 kb was isolated from one of
the transformants and used to perform the gene
replacement procedure in S. noursei ATCC11455 (according
to Sekurova et al., 1999, supra). The recombinant
S. noursei strain selected after this procedure was
designated NPR2.1.
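
For the DH8 and KR7 constructs the deletion is defined by two restriction sites, so the nucleotide gap between the sites can be compared directly with the amino acid range said to be eliminated. The sketch below does this for both; the class name JunctionDeletion and its fields are hypothetical, and only coordinates quoted in the text are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JunctionDeletion:
    """A deletion defined by two junction coordinates and an aa range."""
    name: str
    left_nt: int
    right_nt: int
    first_aa: int
    last_aa: int

    @property
    def deleted_bp(self) -> int:
        return self.right_nt - self.left_nt

    @property
    def deleted_aa(self) -> int:
        return self.last_aa - self.first_aa + 1

    def consistent(self) -> bool:
        """True if the gap is in frame and matches the stated aa range."""
        return self.deleted_bp % 3 == 0 and self.deleted_bp // 3 == self.deleted_aa


DELETIONS = [
    JunctionDeletion("DH8", 46993, 47680, 10150, 10378),
    JunctionDeletion("KR7", 42802, 43099, 8753, 8851),
]

for d in DELETIONS:
    print(d.name, d.deleted_bp, d.deleted_aa, d.consistent())
```

In both cases the nucleotide gap is a multiple of three and equals three times the number of amino acids listed, so the coordinates are internally consistent with in-frame deletions.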

Analysis of the secondary metabolites in the
culture extracts of the S. noursei NPR2.1 recombinant
strain by TLC revealed the presence of presumed
macrolide compounds. The relative mobility of these
compounds differed from nystatin, and from the
metabolites produced by the NPR1 mutant. No UV spectra
characteristic for nystatin could be detected in the
extracts. It is suggested that in the new molecule(s)
produced by the NPR2.1 recombinant a set of 4 double
bonds on the nystatin aglycone has been disturbed, and
the macrolactone ring now contains a keto group at C25
(Table 5). No attempts at purification of the
compound(s) from NPR2.1 were made, and no bioassays for
antifungal activity with the NPR2.1 culture extracts
were performed. This mutant has utility not only by
virtue of the metabolites it produces, but also for
further manipulation of the nystatin PKS.

Inactivation of the ER domain in module 5 of the mutated
NysC protein (NysC_DH8)

To introduce the second mutation into the NysC
protein with inactivated DH domain in module 8
(NysC_DH8), the plasmid pERD4.2 was introduced into the
S. noursei mutant NPR1.1 and the gene replacement
procedure was carried out as described in Sekurova et
al., 1999, supra. This yielded recombinant S. noursei
strain ERDH9 with mutations in both ER5 and DH8 coding
sequences of nysC. The combination of these two
mutations presumably leads to biosynthesis of the
pentaenic nystatin derivative with a hydroxy group at
C23 (Table 5). Preliminary analysis of the ERDH9
culture extracts confirmed that a polyene compound(s) is
being produced by this strain, although in quantities
making identification of its true UV spectrum difficult.
The preliminary data also show that this compound is
preferentially accumulated in the culture supernatant,
while nystatin produced by the wild-type S. noursei
remains mostly associated with mycelium. This is
consistent with the hypothesis that an additional
hydroxy group on the nystatin molecule is responsible
for increased water solubility of the compound(s). No
attempts at purification of this new compound(s) were
made, and no bioassays were performed.

To date, the Inventors have been unable to confirm
the structure of the products of mutants NPR1.1, NPR2.1
and ERDH9.

Table 5. Predicted structures of nystatin derivatives
produced via genetic engineering of nysC gene in
S. noursei (see Example 2 for details)

Mutant     Expected structure        UV spectrum,        Activity
           (polyketide moiety only)  nm (DMSO)

ATCC11455  nystatin (Fig. 1)         299, 312, 327       normal
ERD44      [structure drawing]       375, 395, 419, 444  high
ERD48      [structure drawing]       336, 352, 370, 391  low?
NPR1       [structure drawing]                           low
NPR2.1     [structure drawing]
ERDH9      [structure drawing]

Example 3 - Manipulation of the regulatory genes leading
to increased production of nystatin

Expression of nysR1 under the control of PermE* promoter
in S. noursei ATCC11455

To further confirm that the nysR1 gene encodes a
transcriptional activator of the nystatin biosynthesis
genes, it was expressed in S. noursei ATCC11455
under the control of the PermE* promoter (Bibb et al.,
Mol. Microbiol., v. 14: 433-545, 1994). First, the
plasmid pSOK804 for stable and efficient integration
into the S. noursei ATCC11455 genome was constructed
(Fig. 2). This was made by ligating together the 3.0 kb
SphI/HindIII DNA fragment from pSET152 (Bierman et al.,
1992, supra) and the 2.3 kb SphI/HindIII fragment from
bacteriophage VWB carrying functions necessary for
site-specific integration (Van Mellaert et al., 1998,
supra). Conjugation of pSOK804 from E. coli ET12567
(pUZ8002) into S. noursei ATCC11455 demonstrated that
this plasmid integrates specifically into one site of
the S. noursei genome at a frequency of 3 x 10^-6.

To clone the nysR1 gene under the PermE* promoter in
pSOK804, the following procedure was employed. The
N-terminal part of nysR1 was PCR-amplified from the
recombinant phage N1 DNA template with the
oligonucleotide primers NR1.1 (sense):
5'-CGCCGCATGCTGTTCTCACCCCACGT-3' (SEQ ID NO: 31), and NR1.2
(antisense): 5'-GGCGCGACCCGGTTCGGCCT-3' (SEQ ID NO: 32).
The oligonucleotide primer NR1.1 sequence corresponded
to nt 51376-51391 of SEQ ID NO: 1 with addition of
nucleotides CGCCGCATGC at the 5' end to create a site
for SphI endonuclease. The oligonucleotide primer NR1.2
corresponded to the sequence complementary to nt 51964-
51982 of SEQ ID NO: 1, and encompassed a restriction
site for AgeI endonuclease. A 0.6 kb fragment was
PCR-amplified using the phage N1 DNA as a template under
conditions described in Example 1, and digested with SphI
and AgeI. The digested 0.6 kb fragment was ligated
together with a 2.8 kb AgeI/EcoRI DNA fragment from the
phage N1 insert (EcoRI site originating from the phage's
polylinker) into pGEM7Zf(-) vector digested with SphI
and EcoRI. The 2.8 kb fragment used in this ligation
corresponded to nt 51971-54814 of SEQ ID NO: 1, and
encompassed the C-terminal part of nysR1 and N-terminal
part of nysR2 (encoding the first 162 aa of NysR2). The
recombinant plasmid obtained as a result of this
ligation was designated pNRE1. The 3.5 kb SphI/HindIII
DNA fragment from pNRE1 was ligated together with the
0.3 kb EcoRI/SphI fragment from pGEM7ermELi (see Table
3) containing the PermE* promoter into pSOK804 vector
digested with EcoRI/HindIII. The resulting plasmid,
pNRE2, was introduced into S. noursei ATCC11455 by
conjugation (see Example 1), yielding recombinant strain
S. noursei (pNRE2). Analysis of the nystatin production
by the latter strain in shake-flasks with reduced
glucose medium revealed that it produces 50% more
nystatin compared to the wild-type strain, most likely
due to overexpression of the nysR1 gene from the PermE*
promoter. It therefore appears that the nysR1 gene
encodes a positive activator that may be used for
enhancing the production of nystatin and its derivatives
in S. noursei strains.

Partial deletion of nysR5 gene in the S. noursei
ATCC11455

To confirm the function of the nysR5 gene predicted
through analysis of its coding DNA sequence (see Example
1) in the regulation of nystatin biosynthesis, a
specific mutation in the S. noursei ATCC11455 genome was
introduced. A DNA fragment from the S. noursei
ATCC11455 genome encompassing nt 62037-63360 of the
nucleotide sequence reported here was amplified by PCR
with the primers NR5D1
(5'-GCGAGCGGCCGCTTCACCCCGCAACTCA-3') (SEQ ID NO: 33) and
NR5D2 (5'-CGCGAAGCTTGGCCGACTGCTCGACGTC-3') (SEQ ID NO:
34). The conditions for PCR were as described above. The
1341 bp PCR product was digested with NotI and HindIII,
and ligated with the 1688 bp EcoRI/NotI DNA fragment (nt
60301-61989) and the 3.0 kb EcoRI/HindIII fragment from
pSOK201. The resulting plasmid, pNR5D, contained the S.
noursei DNA fragment with a 43 bp deletion in the coding
region of the nysR5 gene. This deletion creates a
frame-shift mutation within the nysR5 coding region,
subsequently leading to truncation of its product. As a
result of such truncation, 165 C-terminal amino acids of
NysR5 are eliminated, and replaced with 14 amino acids
encoded by another reading frame (and thus unrelated to
NysR5). The pNR5D plasmid was used to perform a gene
replacement procedure in S. noursei ATCC11455 as
described in Sekurova et al., 1999, supra.
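
The contrast between this 43 bp deletion and the in-frame deletions made in nysC comes down to divisibility by three. A minimal illustration, with a hypothetical helper name (classify_deletion) and using only lengths quoted in the text:

```python
def classify_deletion(n_bp: int) -> str:
    """Label a deletion by its effect on the downstream reading frame."""
    return "in-frame" if n_bp % 3 == 0 else "frameshift"


# The 43 bp deletion introduced into nysR5.
print(43 % 3, classify_deletion(43))

# For comparison, the 1116 bp deletion introduced into nysC (ERD44).
print(1116 % 3, classify_deletion(1116))
```

Because 43 leaves a remainder of one when divided by three, every codon downstream of the deletion is read in a shifted frame until a stop codon is met, which matches the description of 14 unrelated amino acids replacing the C-terminal portion of NysR5.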

The mutation introduced through gene replacement
led to a 5-15% increase in nystatin production by the
resulting recombinant strain NR5D, compared to the
wild-type S. noursei. The subtle but reproducible positive
effect of the NysR5 C-terminal deletion on nystatin
biosynthesis correlates well with the putative repressor
function assigned to this protein based on computer
analysis (see Example 1). Since the deletion introduced
in the nysR5 gene does not eliminate the N-terminally
located putative helix-turn-helix motif identified in
this protein, the residual repressor activity of the
truncated NysR5 polypeptide could account for the
relatively small effect of this mutation on nystatin
production. Nevertheless, this result confirms the
usefulness of introducing mutations in the
repressor-encoding gene as a further means for enhancing
the production of nystatin and its derivatives.

Example 4 - Completion of the sequencing of the nystatin
biosynthesis gene cluster

The DNA sequence spanning the gap between SEQ ID
No. 1 and SEQ ID No. 2 was determined for both DNA
strands on the overlapping inserts in recombinant phages
N20, N32, N95, N98, and N99 covering this region.

Procedures used for sequencing, probe generation
and screening were as described in Example 1. The
probes used for library screening, and the relevant
recombinant phages discovered were as follows:

PKS 72      N20, N32
L44ES3.5    N90
L76SN0.5    N95
L20S0.64    N98, N99

Description of the probes:

6. L44ES3.5 probe. The 3.5 kb DNA fragment
isolated from the phage N44 with restriction enzymes
EcoRI and SalI was used as the L44ES3.5 probe.

7. L76SN0.5 probe. The 0.5 kb DNA fragment
isolated from the recombinant phage N76 with restriction
enzymes SacI and NotI was used as the L76SN0.5 probe.

8. L20S0.64 probe. The 0.64 kb DNA fragment
isolated from the plasmid pL20EB3.7 with the restriction
enzyme SalI was used as the L20S0.64 probe.

New sequence information resulted in identification
of the complete nysI and nysDII genes, as well as the new
genes nysJ, nysK, nysL, nysM, and nysN (see Figure 7).
According to the new information, the NysI protein of
9477 aa (SEQ ID No. 37) represents a PKS composed of six
modules, responsible for the elongation steps 9 to 14 of
the nystatin polyketide backbone biosynthesis. All
these modules contain KS, AT, DH, KR and ACP domains.
The presence of a mAT domain in module 11 is consistent
with incorporation of a methylmalonyl-CoA extender at this
elongation step. The DH domains in modules 10, 11, 12,
and 14 seem to be inactive due to the absence of the
active site motif H(X3)G(X4)P. The KR domain in module 13
of NysI lacks the conserved motif aSRrG, and thus appears
to be inactive. The latter feature, together with an
inactive DH domain in module 11, most probably accounts
for the presence of a six-membered ketalic ring (between
C13 and C17) on the nystatin molecule.
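
A motif written as H(X3)G(X4)P can be screened for with a regular expression. The sketch below is only an illustration of that idea: it assumes X3 and X4 mean exactly three and four residues of any kind, and the two test sequences are invented for the example and are not taken from NysI or any other real protein.

```python
import re

# H, any three residues, G, any four residues, P (assumed reading of the motif).
DH_MOTIF = re.compile(r"H.{3}G.{4}P")


def has_dh_motif(protein_seq: str) -> bool:
    """Return True if the one-letter protein sequence contains the motif."""
    return DH_MOTIF.search(protein_seq.upper()) is not None


# Hypothetical peptides for demonstration only.
with_motif = "MSTHAVLGSAALPLEV"     # contains H-AVL-G-SAAL-P
without_motif = "MSTYAVLGSAALPLEV"  # same peptide with H replaced by Y

print(has_dh_motif(with_motif), has_dh_motif(without_motif))
```

A pattern match of this kind only flags the presence or absence of the signature; it says nothing about whether a domain carrying the motif is catalytically active.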

The nysJ gene, encoding a PKS, is located downstream
of nysI, and is transcribed in the same direction. As
judged from the organisation of modules in NysJ (SEQ ID
No. 38), the latter is required for the elongation steps
15 to 17 in the nystatin macrolactone ring assembly.
The DH domains in modules 16 and 17 within NysJ seem to
be inactive, and the ER domain localized in module 15 is
most probably responsible for the reduction of a double
bond between C8 and C9.

The last, 18th module in the nystatin PKS system is
represented by the NysK protein (SEQ ID No. 39) encoded
by the nysK gene found downstream of nysJ. The NysK
protein of 2066 aa is composed of KS, AT, inactive DH,
ACP and TE domains. The NysK protein lacks a KR domain,
and contains an apparently inactive DH domain. A TE domain
was identified at the C-terminus of NysK, suggesting that
in addition to the condensation of the last extender unit,
this protein also participates in the release of the
mature nystatin polyketide chain from the PKS complex.

The gene organisation of the cluster is shown in
Figure 9, and Figure 8 sets out the proposed involvement
of the various proteins encoded in the nystatin
biosynthetic pathway.
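
The division of labour among the PKS proteins can be summarised as a simple lookup. The sketch below encodes the module assignments given in this Example for NysI, NysJ and NysK together with those listed for NysA, NysB and NysC in Table 6; the dictionary and function names are hypothetical and serve only to show that the assignments cover elongation modules 1 to 18 without gaps or overlaps.

```python
# Module assignments as stated in the text and in Table 6.
PKS_MODULES = {
    "NysA": ["loading"],
    "NysB": [1, 2],
    "NysC": [3, 4, 5, 6, 7, 8],
    "NysI": [9, 10, 11, 12, 13, 14],
    "NysJ": [15, 16, 17],
    "NysK": [18],
}


def protein_for_module(module) -> str:
    """Return the PKS protein that carries the given module."""
    for protein, modules in PKS_MODULES.items():
        if module in modules:
            return protein
    raise KeyError(f"module {module!r} is not assigned to any protein")


# Collect the numbered (elongation) modules across all proteins.
extension_modules = sorted(
    m for modules in PKS_MODULES.values() for m in modules if isinstance(m, int)
)

print(extension_modules == list(range(1, 19)))
print(protein_for_module(5), protein_for_module(11), protein_for_module(18))
```

For instance, the lookup places module 5, whose ER domain was targeted in Example 2, in NysC, and module 11 with its mAT domain in NysI.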

To confirm the involvement of nysI and nysJ in
nystatin biosynthesis, these genes were disrupted in S.
noursei via homologous recombination using the
conjugative suicide vectors pKNI1 and pKNJ1. The
construction of the above vectors for disruption
experiments was as follows. A 3.8 kb EcoRI/BamHI
fragment internal to nysI, and corresponding to
nucleotides 23711-27541 of SEQ ID No. 35, was excised
from N64 DNA and ligated into the corresponding sites of
plasmid pGEM3Zf(-), resulting in plasmid pL64EB3.8. The
S. noursei fragment cloned into pL64EB3.8 was excised
from this plasmid with restriction enzymes EcoRI and
HindIII, and ligated with the 3.0 kb EcoRI/HindIII
fragment from the vector pSOK201, resulting in plasmid
pKNI1.

A 3.7 kb EcoRI/BamHI fragment internal to nysJ,
and corresponding to nucleotides 43287-46992 of SEQ ID
No. 35, was excised from phage N20 DNA, and ligated into
the corresponding sites of plasmid pGEM3Zf(-), resulting
in plasmid pL20EB3.7. The S. noursei fragment cloned
into pL20EB3.7 was excised from this plasmid with
restriction enzymes EcoRI and HindIII, and ligated with
the 3.0 kb EcoRI/HindIII fragment from the vector
pSOK201, resulting in plasmid pKNJ1. Both pKNI1 and
pKNJ1 constructs were transformed into E. coli ET12567
(pUZ8002), and further transferred into S. noursei ATCC
11455 by conjugation, as described in Zotchev et al.,
(2000) Microbiology 146: 611-619. No nystatin
production was detected in either of the pKNI1 and pKNJ1
disruption mutants, thus confirming the requirement of
the identified PKSs for nystatin biosynthesis.

Three genes encoding proteins presumably involved
in modification of the nystatin molecule were identified
between nysK and nysDII. Both the nysL and nysN genes
encode P450 monooxygenases of 394 aa and 398 aa,
respectively (SEQ ID Nos. 40 and 42 respectively), that
are probably responsible for hydroxylation of the
nystatin polyketide moiety at C10, and oxidation of the
methyl group at C16. Which protein is responsible for
which reaction is not yet clear, and additional
experiments are required for exact placement of NysL and
NysN in the nystatin biosynthetic pathway. The nysM
gene apparently encodes a ferredoxin of 64 aa (SEQ ID
No. 41), which presumably constitutes a part of one or
both P450 monooxygenase systems, and serves as an
electron donor [O'Keefe, D.P. & Harder, P.A. (1991).
Occurrence and biological function of cytochrome P450
monooxygenases in the actinomycetes. Mol. Microbiol. 5,
2099-2105].

The DNA sequences extending the region depicted on
Figure 7 (SEQ ID No. 35) approximately 10 kb to the left
(recombinant phage N90), and approximately 5 kb to the
right (recombinant phage N69) were determined. No genes
with plausible functions in nystatin biosynthesis
were found, suggesting that the entire nystatin gene
cluster had been identified.

Thus, based on the complete sequence information
for the nystatin biosynthetic gene cluster, the
following genes are identified and their roles described
as follows (see also Figures 8 and 9):

Table 6. Genes identified within the nystatin
biosynthetic gene cluster of S. noursei ATCC 11455



Designation  Product                         Putative function

nysF         putative 4'-phosphopantetheine  post-translational PKS
             transferase                     modification
nysG         ABC transporter                 efflux of nystatin
nysH         ABC transporter                 efflux of nystatin
nysD3        GDP-mannose-4,6-dehydratase     mycosamine biosynthesis
nysI         PKS type I                      nystatin PKS (modules 9-14)
nysJ         PKS type I                      nystatin PKS (modules 15-17)
nysK         PKS type I                      nystatin PKS (module 18 + TE)
nysL         P450 monooxygenase              hydroxylation at C-10
nysM         ferredoxin                      electron transfer in P450
                                             system
nysN         P450 monooxygenase              oxidation of methyl group
                                             at C-16
nysD2        aminotransferase                mycosamine biosynthesis
nysD1        glycosyltransferase             attachment of mycosamine
nysA         PKS type I                      nystatin PKS (loading module)
nysB         PKS type I                      nystatin PKS (modules 1 and 2)
nysC         PKS type I                      nystatin PKS (modules 3-8)
nysE         thioesterase                    release of polyketide chain
                                             from PKS
nysR1        transcriptional activator       regulation of nystatin
                                             production
nysR2        transcriptional activator       regulation of nystatin
                                             production
nysR3        transcriptional activator       regulation of nystatin
                                             production
nysR4        transcriptional activator       regulation
nysR5        transcriptional repressor       regulation
ORF2         transcriptional activator       regulation
ORF1         peptidase                       peptide metabolism

WO 01/59126                                  PCT/GB01/00509

Example 5 - Manipulation of nystatin biosynthetic genes

Predictive examples presented below provide
guidelines for the rational genetic manipulations of the
nystatin biosynthetic genes aimed at specific chemical
changes in the nystatin molecule. These manipulations
are based on the current understanding of the
structure-function relationship of the polyene
antibiotics, in particular the number of conjugated
double bonds and the presence of two ionisable groups
(exocyclic carboxyl and amino group belonging to the
deoxysugar moiety mycosamine).

Changing the number and positions of conjugated double
bonds.

The conjugated double bonds within the nystatin
macrolactone ring are formed as a result of two reductive
steps performed by PKS modules with ketoreductase (KR)
and dehydratase (DH) activities. Further reduction of
the double bond can be brought about by introducing an
enoylreductase (ER) activity in such PKS modules. This
should result in the formation of a completely saturated
bond instead of a double bond at a specific step of
nystatin biosynthesis. The following manipulations can
be proposed (compounds that theoretically can be produced
as a result of these manipulations are presented on
Figure 10):

- insertion of ER domain into module 3 (1);
- insertion of ER domain into module 4 (2);
- simultaneous inactivation of the ER domain in
  module 5 and insertion of the ER domain into module
  3 (3);
- simultaneous inactivation of the ER domain in
  module 5 and insertion of the ER domain into module
  4 (4);
- simultaneous inactivation of the ER domain in
  module 5 and insertion of the ER domain into module
  7 (5);
- simultaneous inactivation of the ER domain in
  module 5 and insertion of the ER domain into module
  8 (6);
- simultaneous inactivation of the ER domain in
  module 5 and insertion of the ER domain into module
  9 (7);
- simultaneous inactivation of the ER domain in
  module 5 and insertion of the ER domains into
  modules 8 and 9 (8);
- simultaneous inactivation of the ER domain in
  module 5 and insertion of the ER domains into
  modules 7 and 8 (9);


These manipulations can be performed using the 
techniques for gene replacement described for S. noursei 
(Sekurova et al., FEMS Microbiol Lett, 177: 297-304, 
1999). The materials for the manipulations can be provided 
by the nystatin gene cluster itself, or by other PKS gene 
clusters. The latter could be preferable from the 
point of view of genetic stability of the recombinant 
strains. 
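
The reductive logic behind the manipulations listed above can be sketched in a few lines of Python. This is an illustrative model only: the function name and the domain sets used below are assumptions chosen to mirror the text (KR alone leaves a hydroxyl, KR plus DH leaves a double bond, KR plus DH plus ER leaves a saturated bond), and they do not describe the actual domain organisation of any particular nystatin PKS module.

```python
# Illustrative model (an assumption, not the authors' method): predict the
# state of the beta-carbon left by one PKS module from the reductive
# domains that are active in it.

def beta_carbon_state(active_domains):
    """Map a set of active reductive domains to the resulting group or bond."""
    domains = set(active_domains)
    if "KR" not in domains:
        return "keto group"        # no reduction at all
    if "DH" not in domains:
        return "hydroxyl group"    # ketoreduction only
    if "ER" not in domains:
        return "double bond"       # ketoreduction + dehydration
    return "saturated bond"        # full reductive loop


# Hypothetical starting points: a module carrying KR and DH, and a module
# carrying KR, DH and ER.
module_kr_dh = {"KR", "DH"}
module_kr_dh_er = {"KR", "DH", "ER"}

# Inserting an ER domain into a KR + DH module:
after_er_insertion = module_kr_dh | {"ER"}
# Inactivating the ER domain of a KR + DH + ER module:
after_er_inactivation = module_kr_dh_er - {"ER"}

print(beta_carbon_state(module_kr_dh))           # double bond
print(beta_carbon_state(after_er_insertion))     # saturated bond
print(beta_carbon_state(after_er_inactivation))  # double bond
```

Under the same assumptions, removing DH from a KR plus DH module returns "hydroxyl group", which is the outcome expected when a dehydratase domain is inactivated.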



Removal/repositioning of the COOH group 

The exocyclic carboxyl function of the polyene 
antibiotics is believed to play a crucial role in the 
selective toxicity of these compounds. More 
specifically, the inter- and intramolecular interactions 
between the ionizable exocyclic carboxyl and the amino group 
of the mycosamine moiety seem to be of particular importance 
(Resat, H., Sungur, F.A., Baginski, M., Borowski, E., 
Aviyente, V. 2000. Conformational properties of 
amphotericin B amide derivatives - impact on selective 
toxicity. Journal of Computer-Aided Molecular Design, v. 
14, p. 689-703). It is theoretically possible to either 
remove the exocyclic carboxyl from, or reposition it on, 
the nystatin molecule via manipulation of the nystatin 
biosynthetic genes. Nystatin derivatives produced via 
such manipulations could be useful on their own, or serve 
as substrates for further chemical modifications. 
The following manipulations are proposed (resulting 
compounds are represented in Figure 11): 

- replacement of the methylmalonyl-specific 
  acyltransferase (AT) domain in module 11 of the 
  nystatin PKS with a malonyl-specific AT domain (10); 
- replacement of the malonyl-specific AT domain in module 
  12 with a methylmalonyl-specific AT domain, with 
  simultaneous replacement of the methylmalonyl-specific 
  AT domain in module 11 with a malonyl-specific AT 
  domain (11); 
- replacement of the malonyl-specific AT domain in module 
  10 with a methylmalonyl-specific AT domain, with 
  simultaneous replacement of the methylmalonyl-specific 
  AT domain in module 11 with a malonyl-specific AT 
  domain (12); 
- inactivation of the P450 monooxygenase-encoding genes 
  nysL or nysN (whichever is found to be responsible 
  for oxygenation of the methyl group at C-16 on the 
  nystatin molecule) (13). 

It should be noted that the specificity of the P450 
monooxygenase responsible for the appearance of the 
exocyclic carboxyl function can be engineered so that it 
fulfills its function on the new substrates. Such 
methods as site-specific or random mutagenesis, along with 
error-prone PCR and DNA shuffling, might prove useful for 
this purpose. 
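
The effect of the proposed AT domain replacements can be followed with a small bookkeeping sketch. It assumes, as the list above implies, that a methylmalonyl-specific AT domain introduces the methyl branch that is later oxidised to the exocyclic carboxyl, and that a malonyl-specific AT domain introduces none. The helper names are hypothetical and serve illustration only.

```python
# Illustrative sketch (an assumption for clarity): track which modules
# incorporate a methyl branch, based on the extender unit selected by
# the AT domain of each module.

EXTENDER_HAS_METHYL = {"malonyl": False, "methylmalonyl": True}

# AT specificities for modules 10-12 as stated in the text.
wild_type = {10: "malonyl", 11: "methylmalonyl", 12: "malonyl"}


def swap_at(at_by_module, module, new_specificity):
    """Return a copy of the map with one AT domain replaced."""
    if new_specificity not in EXTENDER_HAS_METHYL:
        raise ValueError(f"unknown AT specificity: {new_specificity}")
    updated = dict(at_by_module)
    updated[module] = new_specificity
    return updated


def methyl_branch_modules(at_by_module):
    """List modules whose AT domain introduces a methyl branch."""
    return sorted(m for m, at in at_by_module.items() if EXTENDER_HAS_METHYL[at])


# Manipulation (10): the methyl branch is removed altogether.
variant_10 = swap_at(wild_type, 11, "malonyl")
# Manipulation (11): the methyl branch moves from module 11 to module 12.
variant_11 = swap_at(swap_at(wild_type, 12, "methylmalonyl"), 11, "malonyl")
# Manipulation (12): the methyl branch moves from module 11 to module 10.
variant_12 = swap_at(swap_at(wild_type, 10, "methylmalonyl"), 11, "malonyl")

print(methyl_branch_modules(wild_type))   # [11]
print(methyl_branch_modules(variant_10))  # []
print(methyl_branch_modules(variant_11))  # [12]
print(methyl_branch_modules(variant_12))  # [10]
```

Whether the P450 monooxygenase would accept a methyl group at the new position is a separate question, which is why engineering of its specificity is raised in the text.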

1.3. Introduction of additional hydroxyl functions 
(increasing water solubility). 

Polyene antibiotics are very poorly soluble in 
water, mostly due to a highly hydrophobic set(s) of 
conjugated double bonds. Increasing water solubility can 
be an advantage in certain cases, as it expands the 
pharmacological properties of the drug (Golenser, J., 
Frankenburg, S., Ehrenfreund, T. & Domb, A.J. 1999. 
Efficacious treatment of experimental leishmaniasis with 
amphotericin B-arabinogalactan water-soluble derivatives. 
Antimicrob Agents Chemother., v. 43: 2209-2214). To 
increase the water solubility of nystatin, we suggest 
introducing additional (hydrophilic) hydroxyl functions into 
the nystatin molecule. The following modifications of 
the nystatin biosynthetic genes can lead to the desired 
effect (resulting compounds depicted in Figure 12): 

- inactivation of dehydratase (DH) domain in module 3 
  of the nystatin PKS (14); 
- inactivation of DH domain in module 4 (15); 
- inactivation of DH domain in module 3 with 
  simultaneous inactivation of ER domain in module 5 
  (16); 
- inactivation of DH domain in module 4 with 
  simultaneous inactivation of ER domain in module 5 
  (17); 
- inactivation of DH domain in module 7 with 
  simultaneous inactivation of ER domain in module 5 
  (18); 
- inactivation of DH domain in module 8 with 
  simultaneous inactivation of ER domain in module 5 
  (19); 
- inactivation of DH domain in module 9 with 
  simultaneous inactivation of ER domain in module 5 
  (20). 

Extension, truncation or rebuilding of the macrolactone 
ring. 

Novel derivatives of the polyene antibiotic nystatin 
can also be obtained through truncation of the nystatin 
PKS, leading to derivatives with a smaller macrolactone 
ring (as exemplified in Example 2 for the ERD48 mutant). 
This can be achieved through deletion of one or more 
modules from the nystatin PKS. Such truncations can lead 
to production of polyketides with 36- to 6-membered 
lactone rings that potentially can be useful for further 
modifications and synthesis of novel pharmaceuticals. 
Extension of the nystatin macrolactone ring can be 
achieved through insertion of additional modules into the 
nystatin PKS. Such manipulations can also lead to 
production of lead compounds for pharmaceutical 
applications. 
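
The quoted range of ring sizes follows from simple arithmetic, sketched below. Two assumptions are made that are not stated in this section: that the wild-type nystatin macrolactone ring has 38 members, and that each deleted PKS module removes one two-carbon unit from the ring. The function name is hypothetical.

```python
# Back-of-the-envelope sketch. Assumptions (not stated in this section):
# the nystatin macrolactone ring has 38 members, and each deleted PKS
# module removes one two-carbon unit from the ring.

WILD_TYPE_RING_SIZE = 38
RING_ATOMS_PER_MODULE = 2


def truncated_ring_size(modules_deleted):
    """Ring size expected after deleting a number of PKS modules."""
    size = WILD_TYPE_RING_SIZE - RING_ATOMS_PER_MODULE * modules_deleted
    if modules_deleted < 0 or size <= 0:
        raise ValueError("deletion count outside the range of this model")
    return size


sizes = [truncated_ring_size(n) for n in range(1, 17)]
print(sizes[0], sizes[-1])  # 36 6
```

Under these assumptions, deleting between one and sixteen modules spans the 36- to 6-membered range given in the text.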

The nystatin molecule can be completely rebuilt by 
shuffling the PKS modules between the 
nystatin and other PKSs so that completely new 
derivatives are produced. In this respect, the method 
disclosed in patent WO 00/77181 can prove useful for 
making the recombinant DNA constructs serving this 
purpose. 

Finally, the nystatin biosynthetic genes can prove 
useful for manipulation of other macrolide antibiotic 
biosynthetic pathways. Both PKS and modification enzymes 
can prove useful for such purposes. It seems likely that 
nystatin biosynthetic genes will be most useful for 
manipulation of other polyene antibiotic biosynthetic 
clusters, such as the one for pimaricin (Aparicio, J.F., 
Fouces, R., Mendes, M.V., Olivera, N., Martin, J.F. 2000. 
A complex multienzyme system encoded by five polyketide 
synthase genes is involved in the biosynthesis of the 
26-membered polyene macrolide pimaricin in Streptomyces 
natalensis. Chem Biol, v. 7: 895-905). The high degree of 
similarity at the protein level between the nystatin and 
pimaricin biosynthetic enzymes will most probably ensure 
that their hybrids are functional. On the other hand, 
different specificities of the heterologous modification 
enzymes might provide new tools for further structural 
changes to the molecules produced by genetically 
engineered strains. 

Example 6: Further manipulations of the regulatory genes 
leading to increased production of nystatin 



Expression of nysRII under control of the PermE* promoter 
in S. noursei ATCC 11455 

To confirm the function of the nysRII gene 
predicted through analysis of its coding sequence (see 
Example 1), it was expressed in S. noursei ATCC 11455 
under control of the PermE* promoter. The 2168 bp 
SalI-BclI fragment from the phage N58 (representing the C-terminal 
part of nysRII) was cloned into the SalI-BamHI digested 
pGEM11Zf(-), resulting in construct pC1A1. The 811 bp 
N-terminal part of the nysRII gene was PCR-amplified from 
the phage N58 template with the oligonucleotide primers 
NSR2.1 (5'-GCCGGCATGCGACGAACAGGACGAGAGGT-3') (SEQ ID 
NO. 44) and NSR2.3 (5'-GCCGTGGTCGACGAAGGC-3') (SEQ ID 
NO. 45). The conditions for PCR were as described above. 
The PCR fragment was digested with SphI and SalI, and 
ligated, together with the 2168 bp SalI-HindIII fragment 
from pC1A1, into the SphI-HindIII digested pGEM3Zf(-) vector, 
giving the plasmid pC2A1. From the latter, the 3.0 kb 
SphI-HindIII fragment was isolated and ligated, together 
with the 0.3 kb EcoRI-SphI fragment containing the PermE* 
promoter, either with the EcoRI-HindIII digested pSOK804 
vector (generating plasmid pC3A1), or with the 3.0 kb 
EcoRI-HindIII fragment from pSOK201 (generating plasmid 
pC3E1). Since the pSOK804-based vectors integrate 
site-specifically in the S. noursei chromosome, the pC3A1 
plasmid could be regarded as a construct for nysRII 
expression in-trans. Plasmid pC3E1, on the other hand, 
is a suicide vector capable of integrating into the S. 
noursei genome only via homologous recombination through 
the cloned nysRII gene, thus providing expression of the 
latter in-cis. Introduction of plasmids pC3A1 and pC3E1 
resulted in recombinant strains C3A1 and C3E1, 
respectively. In the latter strain the PermE* promoter is 
placed upstream of both the nysRII and nysRIII genes 
(in-cis), while in the former strain the PermE* promoter is 
placed upstream of the nysRII gene (in-trans). Nystatin 
production by the C3A1 and C3E1 strains was increased by 
18% and 22%, respectively, compared to the wild-type S. 
noursei. Moreover, during fermentation experiments it 
was noticed that nystatin production by the C3E1 strain 
reached its maximum 24 h earlier than the wild-type 
strain. These data support the assumption that the 
nysRII gene encodes a positive regulator that may be used 
for enhancing the production of nystatin and its 
derivatives in S. noursei strains. 
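
The cloning steps above rely on restriction sites carried by the PCR primers. As an illustrative check, not part of the original protocol, the following sketch locates the standard recognition sequences of SphI, SalI and HindIII in the two nysRII primers. The primer strings are taken from the text with the spacing removed, and the function name is hypothetical.

```python
# Illustrative check (not part of the original protocol): locate the
# recognition sites of the restriction enzymes named in the text within
# the PCR primers. Recognition sequences are the standard ones.

RECOGNITION_SITES = {
    "SphI": "GCATGC",
    "SalI": "GTCGAC",
    "HindIII": "AAGCTT",
}

PRIMERS = {
    "NSR2.1": "GCCGGCATGCGACGAACAGGACGAGAGGT",
    "NSR2.3": "GCCGTGGTCGACGAAGGC",
}


def find_sites(primer):
    """Return {enzyme: 0-based position} for each site found in the primer."""
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        position = primer.find(site)
        if position != -1:
            hits[enzyme] = position
    return hits


for name, sequence in PRIMERS.items():
    print(name, find_sites(sequence))
# NSR2.1 {'SphI': 4}
# NSR2.3 {'SalI': 6}
```

The result is consistent with the digestion of the PCR fragment with SphI and SalI described in the text.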

Expression of nysRIV under control of the PermE* promoter 
in S. noursei ATCC 11455 

The start codon for nysRIV was reassigned, and is 
likely to be located 48 nt upstream of the originally 
proposed start nucleotide. Thus, nysRIV presumably 
encodes a 226 aa (long) rather than a 210 aa (short) 
protein as was previously suggested. The long and short 
versions of the nysRIV gene were PCR-amplified from the 
N58 recombinant phage DNA with oligonucleotide primers 
NR4P3 (5'-CTCAGCATGCCGAAAGGATGGCGG-3') (SEQ ID NO. 46) 
and NR4P5 (5'-AGGCAAGCTTCGGCGACACGGGCGT-3') (SEQ ID NO. 
47), or NR4P4 (5'-CTCAGCATGCGTACGACCGGCGGG-3') (SEQ ID 
NO. 48) and NR4P5, respectively. The conditions for PCR 
were as described above. The corresponding PCR products 
of 0.78 kb and 0.73 kb were digested with SphI and 
HindIII, and ligated, together with the 0.3 kb EcoRI-SphI 
fragment containing the PermE* promoter, with the 
EcoRI-HindIII digested pSOK804 vector, resulting in plasmids 
pNR4EL and pNR4ES, respectively. Both plasmids were 
introduced into S. noursei ATCC 11455, generating mutant 
strains NR4EL and NR4ES, respectively. Nystatin 
production by the strain NR4ES (expressing a 210 aa 
protein from the PermE* promoter) did not differ 
significantly from that of the wild-type S. noursei 
harboring only pSOK804, while the NR4EL recombinant 
(expressing a 226 aa protein from the PermE* promoter) 
produced nystatin at a level 36% above the wild-type 
level. These data suggest that the longer, 226 aa 
version represents the actual NysRIV polypeptide. 
Moreover, these data support the assumption that the nysRIV 
gene encodes a positive regulator that may be used for 
enhancing the production of nystatin and its derivatives 
in S. noursei strains. 
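
The relationship between the 48 nt shift of the start codon and the two protein lengths can be verified with a short calculation. The sketch below uses only figures quoted in the text; the variable names are arbitrary.

```python
# Worked arithmetic, using only figures quoted in the text.

NT_PER_CODON = 3
upstream_shift_nt = 48   # start codon moved 48 nt upstream
short_protein_aa = 210   # originally proposed NysRIV length

extra_codons, remainder = divmod(upstream_shift_nt, NT_PER_CODON)
assert remainder == 0, "shift must keep the reading frame"

long_protein_aa = short_protein_aa + extra_codons
print(extra_codons, long_protein_aa)  # 16 226

# The two PCR products (0.78 kb and 0.73 kb) differ by about 0.05 kb,
# which matches the 48 nt shift to within rounding.
size_difference_nt = round((0.78 - 0.73) * 1000)
print(size_difference_nt)  # 50
```

A shift of 48 nt therefore adds 16 in-frame codons, which accounts for the 226 aa long form.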
Claims 

1. A nucleic acid molecule comprising: 

(a) a nucleotide sequence as shown in SEQ ID No. 35; or 

(b) a nucleotide sequence which is the complement of 
SEQ ID No. 35; or 

(c) a nucleotide sequence which is degenerate with SEQ 
ID No. 35; or 

(d) a nucleotide sequence hybridising under conditions 
of high stringency to SEQ ID No. 35, to the complement 
of SEQ ID No. 35, or to a hybridisation probe derived 
from SEQ ID No. 35 or the complement thereof; or 

(e) a nucleotide sequence having at least 80% sequence 
identity with SEQ ID No. 35; or 

(f) a nucleotide sequence having at least 65% sequence 
identity with SEQ ID No. 35 wherein said sequence 
preferably encodes or is complementary to a sequence 
encoding a nystatin PKS enzyme or a part thereof. 

2. A nucleic acid molecule as claimed in claim 1, 
said molecule comprising: 

(a) a nucleotide sequence as shown in SEQ ID No. 1 
and/or in SEQ ID No. 2; or 

(b) a nucleotide sequence which is the complement of 
SEQ ID No. 1 and/or SEQ ID No. 2; or 

(c) a nucleotide sequence which is degenerate with SEQ 
ID No. 1 and/or SEQ ID No. 2; or 

(d) a nucleotide sequence hybridising under conditions 
of high stringency to SEQ ID No. 1 and/or SEQ ID No. 2, 
to the complement of SEQ ID No. 1 and/or SEQ ID No. 2, 
or to a hybridisation probe derived from SEQ ID Nos. 1 
and/or 2 or the complements thereof; or 

(e) a nucleotide sequence having at least 65% sequence 
identity with SEQ ID No. 1 and/or SEQ ID No. 2, wherein 
said sequence preferably encodes or is complementary to 
a sequence encoding a nystatin PKS enzyme or a part 
thereof. 

3. A nucleic acid molecule as claimed in claim 1 or 
claim 2 which comprises a nucleotide sequence having at 
least 70% sequence identity with SEQ ID No. 35, or SEQ 
ID No. 1 and/or SEQ ID No. 2, wherein said sequence 
preferably encodes or is complementary to a sequence 
encoding a nystatin PKS enzyme or a part thereof. 

4. A nucleic acid molecule comprising a part of a 
nucleotide sequence as defined in claim 1 or claim 2, 
wherein said part is at least 15 nucleotides in length. 

5. A nucleic acid molecule as claimed in 
claim 1, which does not comprise the nucleotide 
sequence encoding ORF 1 as set out in Table 1. 

6. A nucleic acid molecule as claimed in 
claim 1, which does not comprise the nucleotide 
sequence encoding ORF 2 as set out in Table 1. 

7. A nucleic acid molecule as claimed in any one of 
claims 1 to 6 which encodes one or more polypeptides, 
or comprises one or more genetic elements, having 
functional activity in the synthesis of a macrolide 
antibiotic or a polyketide moiety. 

8. A nucleic acid molecule as claimed in claim 7, 
wherein said macrolide antibiotic or polyketide moiety 
is nystatin or a nystatin derivative. 

9. A nucleic acid molecule as claimed in any one of 
claims 1 to 6, said molecule comprising one or more 
genes, and/or one or more regulatory sequences, and/or 
one or more modules, or enzymatic domains, or 
non-coding or coding functional genetic elements of a PKS 
gene cluster. 

10. A nucleic acid molecule as claimed in any one of 
claims 3 to 9, comprising a nucleotide sequence defined 
by any one or more of the nucleotide "start" and "end" 
positions within SEQ ID Nos. 35, 1 or 2, as set out in 
Table 1. 

11. A nucleic acid molecule as claimed in any one of 
claims 3 to 9, comprising a nucleotide sequence defined 
as lying between any one or more of the nucleotide 
sequences having the "start" and "end" positions within 
SEQ ID Nos. 35, 1 or 2, as set out in Table 1. 

12. A nucleic acid molecule comprising a nucleotide 
sequence encoding one or more amino acid sequences 
selected from SEQ ID Nos. 3 to 20 or 36 to 43, or a 
nucleotide sequence which is complementary thereto or 
degenerate therewith or comprising a nucleotide 
sequence which encodes one or more amino acid sequences 
which exhibit at least 60% sequence identity with any 
one of SEQ ID Nos. 3 to 20 or 36 to 43. 

13. A nucleic acid molecule as claimed in claim 12 
which encodes one or more amino acid sequences which 
exhibit at least 85% sequence identity with any one of 
SEQ ID Nos. 3 to 20 or 36 to 43. 

14. A polypeptide encoded by a nucleic acid molecule 
as defined in any one of claims 1 to 13. 

15. A polypeptide as claimed in claim 14, comprising: 

(a) all or part of an amino acid sequence as shown in 
any one or more of SEQ ID Nos. 3 to 20 or 36 to 43; or 

(b) all or part of an amino acid sequence which has at 
least 60% sequence identity with any one or more of SEQ 
ID Nos. 3 to 20 or 36 to 43. 

16. A polypeptide as claimed in claim 14, wherein said 
amino acid sequence at (b) has at least 85% sequence 
identity with any one or more of SEQ ID Nos. 3 to 20 or 
36 to 43. 

17. A polypeptide as claimed in any one of claims 14 
to 16 or claim 11 having functional activity in the 
synthesis of a macrolide antibiotic or polyketide 
moiety. 

18. A polypeptide as claimed in any one of claims 14 
to 17, comprising a functional region of any one of SEQ 
ID Nos. 3 to 20 or 36 to 43, said functional region being 
as defined in Table 2. 

19. An expression vector comprising a nucleic acid 
molecule as defined in any one of claims 1 to 13. 

20. A host cell or transgenic organism comprising a 
nucleic acid molecule as defined in any one of claims 1 
to 13 or an expression vector as defined in claim 19. 

21. A method for producing a polyketide or macrolide 
molecule, said method comprising expressing within a 
host cell a nucleic acid molecule as defined in any one 
of claims 1 to 13. 

22. Use of a nucleic acid molecule as defined in any 
one of claims 1 to 13 in the preparation of a modified 
PKS system, or in the preparation of a modified 
polyketide molecule. 

23. A method for preparing a modified PKS system, or a 
modified polyketide molecule, said method comprising 
modifying a nucleic acid molecule as defined in any one 
of claims 1 to 13 or introducing a nucleic acid 
molecule as defined in any one of claims 1 to 13 into a 
further PKS-encoding nucleic acid molecule. 

24. A method as claimed in claim 23, being a method of 
preparing a nucleic acid molecule which comprises a 
nucleotide sequence encoding a modified polyketide 
synthase enzyme or enzyme system, said method 
comprising (a) using a nucleic acid molecule as 
defined in any one of claims 1 to 13 as a scaffold and 
modifying one or more portions of the nucleic acid 
molecule that encode enzymatic or other functional 
activities; or 

(b) introducing one or more portions of the nucleotide 
sequence that encode enzymatic or other functional 
activities into an alternative (i.e. different 
"second") PKS scaffold. 

25. A method as claimed in claim 23, being a method of 
preparing a nucleic acid molecule comprising a 
nucleotide sequence encoding a modified nystatin PKS, 
wherein said modified nystatin PKS is derived from a 
nystatin PKS encoded by a nucleic acid molecule as 
defined in any one of claims 1 to 13, said nucleic acid 
molecule containing first regions which encode 
enzymatic or other functional activities and second 
regions which encode scaffolding amino acid sequences, 
said method comprising: 

(a) modifying at least one said first region; or 

(b) incorporating at least one said first region into 
a scaffolding-encoding second region from a different 
PKS-encoding nucleotide sequence. 

26. A method of preparing a modified nystatin PKS as 
defined in claim 25, said method comprising expressing 
a nucleic acid molecule prepared as defined in claim 25 
within a host cell under conditions whereby the 
modified nystatin PKS is expressed. 

27. A modified PKS system derived from a nucleic acid 
molecule as defined in any one of claims 1 to 13 by a 
method as defined in any one of claims 23 to 26. 

28. A modified PKS system as claimed in claim 27, 
comprising a modified nystatin PKS wherein one or more 
of the polypeptides of SEQ ID Nos. 3 to 20 or 36 to 43, 
or one or more of the enzymatic domains in one or more 
modules of SEQ ID Nos. 5 to 7, 20, or 37 to 39 are 
modified. 

29. A modified PKS system as claimed in claim 27, 
comprising a modified nystatin PKS wherein 

(i) the ER domain in module 5 of Nys C (amino 
acids 4953-5239 of SEQ ID No. 7) is inactivated; or 

(ii) a module of Nys C (SEQ ID No. 7) is deleted; or 

(iii) the DH domain in module 8 of Nys C (amino 
acids 10086-10289 of SEQ ID No. 7) is inactivated; or 

(iv) the KR domain in module 7 of Nys C (amino 
acids 8812-9086 of SEQ ID No. 7) is inactivated. 

30. A multiplicity of cell colonies comprising a 
library of colonies wherein each colony of the library 
contains a different modified PKS as defined in any one 
of claims 27 to 29. 

31. A method of producing a library of PKS complexes 
or a library of polyketide molecules, said method 
comprising culturing a library of colonies as defined 
in claim 30. 

32. A library of PKS complexes or a library of 
polyketide molecules obtained or obtainable by a method 
as defined in claim 31. 

33. A combinatorial library of polyketides wherein the 
polyketide members of the library are synthesized by a 
modified PKS system as defined in any one of claims 27 
to 29. 

34. A host cell or transgenic organism containing (a) 
a nucleic acid molecule prepared by a method as defined 
in any one of claims 23 to 25 or (b) a modified PKS 
system as defined in any one of claims 27 to 29. 

35. A polyketide produced or producible by a host cell 
or transgenic organism as defined in claim 34. 

36. An antibiotic derived from a polyketide as defined 
in claim 35. 

SEQUENCE LISTING 

SEQ ID NO. 1 
DNA sequence: (Nys1; region 1) 

   1 gatccggccg gtgatgaact cctgctcctg tgcggtcagt ccggtgtgcg
  51 tcggcagata gaagccgtcg gcgctgagcc gggccgcgtt caacgcgggc
 101 cagtcggcgg agaagtaacc gggctgacgg ctcatcggct tgaagaagac
 151 ccgggtctcg atgccctcgc cggcgaggta cgcgcacagc tcgtcccgcc
 201 gctcggcccg caggtcgtac atccacagca cgtcgcgcgg cggcatcagc
 251 gtgatgccgg ggatgtcgcg cagcgcctcg tcgtagcgct tctcgatgtc
 301 gcggcgcagg gcgaggatgg tgtccagctg ctcggtctgc gcgagcgcca
 351 ccgcggcctg catgttggtc atccggaagt tgtaggccag cttcttgtgc
 401 aggaagctgt ggtccttggt gaacgccatc gcccgcagat gggccatctg
 451 ctcggccagg tgcgggtcgt gggtcaggca gacgccgccc tcgccggccg
 501 agatgatctt gttggcgaag agcgagaaac aggcgatgtc gccccgcggc
 551 cgcaccccgt gcgcctcggc ggagtcctcc accacccgca ggttgtactc
 601 gtacgccagg ttcagcacgg cgtccatgtc gcactgccgg ccgtagatgt
 651 gcaccggcat gatcactttg gtgcgcgggg tgatcttctc ctcgatgcgg
 701 gacacgtcga tgttcaggtc gtcgccgcag tccacgaaca ccggcgtggc
 751 accggtgtag gtcaccgccc aggcggacgc gatcatcgtg aactccggga
 801 cgatcacctc gtcaccgggg ccgacgccca gcgcgcgcag cgccagcgtc
 851 agcgccgtgg tgccggagga gcaggcgacg ccgaacggca cgtcgttgta
 901 cgcggcgaac gcctcctcga accgcctgac gtacggcccc tgcgaagaga
 951 tccagccgcc gccgacggcc tccgtcacat agtcgagctc gcggccctgg
1001 agccacggca tggacaccgg atacgtaaag gacatgggtt ttgagtcttc
1051 ctcggtcagt cggttgccag ggcgggaagg ccgagcagca ggtccgccgc
1101 cgcggcccgg ccgccggcgt cccgcagcag cccggcgaag tgctccgcgc
1151 gctcggtgaa ggagggttgg tcgagcacgc gggtgatctt gtccaggacg
1201 tcctcggtgt ccacggtctc cggccggtcc agggtcaggc tcaccccgaa
1251 gtcctggccc cggatcgcct ggtcgtcgca gtccacccac aacggccgga
1301 ccaccagcgg ctttccgaag tacaggccct cgtggtagcc gttgccaccg
1351 gcatgggtga agaacgcctt cacgttcgga tgggccagca cgtccagctg
1401 cgacggcacc cagccctcga tccgcaggtt gtccggcagc tcggcggccg
1451 gcggcagcaa ctcctgttgg ccgcgcggga gtttccacaa cacctggtgg
1501 ccccggccgt ccagtcgccg ggcgacctcc accagcgacg ccacctgctc
1551 acgggtcagc cgggtgatcg tgccgaagcc catgtacacc acggacttct
1601 gcgccgacag ccagtccgac aggccgtcgt cgtccggtgc ctggggcagc
1651 ggcggcacca tcgtgcccac cagccgcagc ttcggatgca tcgggaacgg
1701 gtagtccaac tcccttacgg agtagcacaa gacctgctcc gcatggtcga
1751 tccgcgccat catctgccgc gcctggggcg cgatgcccag ctcggtgcgg
1801 acccggttgt cctcctcgac gaccttgcgg acgtccgacg tcaggaacat
1851 cccgagcgtc cgcagccgga acagctggtt ctcgatccgc tgagccaggg
1901 acatcgcggc cggcagcccc gagtgcggca ccgggaaacc cgacggggtg
1951 taggacttgg cgaacgggac gtgcgaggtg aggacgttgc tcggcacgaa
2001 cggcaccccg agcacgaacg gaatgccctt ggtgatcgcc agctcgtacc
2051 cgaactggca catgctctcg atcaccatca gcgccggctc gacctcctcg
2101 acgatctcct ccaggcggcg gtacttcgcc atccgcgact ccggcgcgaa
2151 cgaatgccga atcaccgccg cgtgcgcctt gaaccgcgac cgctgcgtca
2201 cctccgcata cgtcgcgtcg tcccatgtga ccgccgacat ctgcgagacg
2251 gtgtcgccga gcgacgcgaa ccgaaccggg ctgccgtcca ccacggccgc
2301 cacctcgtcg cgcgctttct cgtcggtggc gaaccacagg tccgccacgt
2351 cgcgccggga caattccccg gccagcacga gcagcggatt gagcaggccg
2401 ctttcggcat aactgacgaa caggatcggc cgccgattcg cgcccatgga
2451 caacacccct cggaatgtgg cgggccgccg ggcccgcgcg ccacgcaccc
2501 gcccggcccg gtcgccgggt gagtgcattc gccgacgccg ccacccgagg
2551 cgcgtgttgc cggaaggaag ggtcaccggc cggcacccgg aacgcgccgc
2601 gtggaaaacg ggtcggttac ttggtctcat gccacggacc ggggaatcac
2651 tagtcttcgg cgcgcgacgg ccctttccgg gccgtgtggc caatgcccgt
2701 ccccggcgcc cgtcattcct tagggaaaag tacagcgttt gcgaacgtac
2751 gatccggcac gcagaggtga cctgaggcca acttttccgc aggggtgagc
2801 aaggcatgac gatcggagcc gacgaggacc cggtggtggt cgtcggaatg
2851 gcctgccgtt atccgggtgg ggtcgccggc ccggaggacc tgtgggaact
2901 ggtccgcacc ggccgcgacg cgaccaccgc cttcccggac gaccgcggct
2951 gggacctggc cgcactggcc ggcgacggac ccggccgcag cgcgacccgc
3001 gagggcggat tcctcaccgg cgccgccgac ttcgacgccg ccttcttcgg
3051 catgtcgccc cgcgaggccg tctccaccga cccgcaacag cgcctcgtcc
3101 tggagaccgc ctgggaagcc ctggagcgcg ccggcatcga cccgcactcc
3151 ctgcgcggca gccgcaccgg ggtcttcgtc ggcgccagcg gccaggacta
3201 cgccgccgtc acccacgcct cgcccgacga cctggacgga cacgccctca
3251 ccggcctggc ccccggcgtc gcctccggtc gcctggcgta cgtcctgggc
3301 ctcgaaggcc ccgccgtcac cgtcgacacc acgtcctcct cgtcgctggt
3351 cgcgctgcac tgggcggtcc gcgccctgcg cgcgggggag tgcagcaccg
3401 ccctggccgg cggcgtcacg gtgatgtcca ccccggccgc cttcgtcggc
3451 cacacccgac agggcggcct cgcgcccgac ggccgctgca agccgttctc
3501 cgacgacgcc gacggcaccg cctgggcgga gggcgtcggc atcgtcgtcc
3551 tggagcacct gtccaccgcc cgcgccgccg gcaaccccgt cctcgccgtg
3601 ctgcgcggct cggccgtcaa ccaggacggc gcctccgacg gcctcaccgc
3651 acccagcggt cccgcccagg aacgcgtcat ccgcgccgcc ctcgccgacg
3701 cccgactcgc ccccgccgac atcgatctcg tcgaggcgca cggcaccggc
3751 acccggctcg gcgaccccgt cgaggcccgg gcgctgctcg ccgcctacgg
3801 ccaggaccgg gacccggacc gaccgctgcg cctcggttcc ctgaagtcca
3851 ccctcggcca cgcacaggcc gccgccggca tcggcggagt gatcaagacc
3901 gtcctgaccc tgcggcacgg cctgatgccg cgcatccggc acctggccac
3951 ccccacccgc caagtcgact ggtcccaggg cgccgtggcc cccctcaccg
4001 accacacgcc ctggccaccg gccgaccgac cgcgccgcgc cggcgtctcc
4051 tccttcggca tcagcggcac caacgcccat gtgatcctcg aagaggcgcc
4101 gcccgccgac gtccccgtca cccggcccgg caccctccgc cccagcaccg
4151 tcccctggcc ggtctccgcc gccacgcccg aagccctcga cgcccaactc
4201 gcccggctcc gcgcccacct gcgcacccac tcggacctgg acccgctgga
4251 cgtcggctac tccctggcca ccggccgcgc cgcgctccgc caccgggcgg
4301 tcctcctgcc gcccgccgac ggcaccgccg cggacgccgt cgagcacgcc
4351 cgcggtgcgg cccaccagcg ccgcaccgcc gtcctcttct ccggccaggg
4401 cagccagcgc ccgggcatgg gccgcgaact cgccgcccgc ttcccggtgt
4451 tcgccgacgc actggacgac gcgctgcgcg ccctggaccg gcacccggac
4501 ggcccggtgc gcgaggtgat gtggggcacc gacgccgcgc tcctggaccg
4551 gaccggctgg acccagcccg ccctgttcgc cgtcgaggtc gccctccacc
4601 gcctggtcgc gtccctcggc gtcacccccg acttcgtcgg cggccactcc
4651 gtcggcgaga tcgccgccgc ccacgtcgcc ggcgtcctga gcctggagga
4701 cgcctgccgc ctggtggccg cccgcgccac gctgatgcag gcgctcccgg
4751 ccggcggcgc gatggccgcg ctggaggcca ccgaggacga agtggccccg
4801 ctgctcggcg cacacctcgc gctggccgcc gtcaacggcc ccaccgcggt
4851 cgtcgtcgcc ggagccgagg acgccgtgcg gcaactgacc gcccgcttcg
4901 ccgaccgcgg ccggcgcacc agccggctgg ccgtctcgca cgccttccac
4951 tcgccgctga tggagcccat gctcgacgcc ttccgggacg tcgtgagccg
5001 actgaccttc caccagccgt cgatcccgct ggtctccaac ctcaccggtg
5051 aactcgccgg cagtgagatc accagcgccg agtactgggt ccggcacgtc
5101 cgcgacaccg tccgcttcgc cgacggcatc accgcactgg ccaaggccgg
5151 cgccgacgtc ctgatcgaac tcggccccgg cggcgtgctg tccgcgatgg
5201 cccgcgacac cctcggcccc gacagcacca ccgacgtcgt ccccgccctg

5251 agcaagggac ggcccgagga gaccgccttc gccggcgccc tcggccgcct
5301 gcacaccctc ggcgtccccg tcgactggcc cgccttctac gccggcaccg
5351 gcgcccgccg cgtcgaactg cccacctacg ccttccagca cgtgcgccac
5401 tggcccaccc cgccccgccc gaacggcgcc gggcccggcg ccctcggcca
5451 ccccctgctc ggctccgccg tcgaactcgc cgacggcggc ggcaccgtct
5501 gctccggcgc cctctccctc cgcacccacc cctggctcgc cgaccacacc
5551 gtcgccgggc gggtcgtgct gccggccacc gcgctgctgg aactcgccgt
5601 gcgcgccggc gacgaggcgg gctgcgacgt cctgcacgaa ctccacctca
5651 ccaccccgcc ggccctgccc gacgacgccg ccctgcacgt ccaggtgcac
5701 gtcggccccg ccgacaccac cgggcgccgc gccgtcaccg tccacacccg
5751 ccccgaccac cacccggccg gcgactggac ccgatgcgcc accggcaccc
5801 tcggcagcac cccgccgtcc gcagccgaag ccgccacggg cggcaccccg
5851 gccgcctggc cgccggccga cgccgaaccc ctcgacctcg ccgaccacta
5901 cgaacggctc gccgaccgcg gcttcgacta cggcccgacc ttccgcggcc
5951 tgcgggccgc ctggcgacgc ggcgcggaga tcttcgcgga cgtggaatgc
6001 ccgcccggca ccgccgacga cgcccccgac cacggactgc accccgccct
6051 gctcgacgcg gcccggcacg ccgccatggc ggtggacggc accgtgcccg
6101 tcgcctggca cggcgtccgg ctgcacgccg tcggcgccac cgcgctgcgg
6151 gtccgcatcc gccccaccac gaccggcacg ctgaccctca ccgcggtcga
6201 cgtgcacggc gcgccggtcg tcaccgtcga ggccctcacc gcccgcccgc
6251 tgaccgacga ggaacgcgcc gccccgcgga cgccgcggca ggcccgcggc
6301 gagacgcccg ccgacgcccg cccggcccgg cccgcggcgg cccgccccgg
6351 cccggccggc gaacccctcc cggacaccac cgggtcccac cccaccgccg
6401 gccacctcgc cgcgctgccg ccggccgccc gggagcgcca gctgctggac
6451 ctggtgcgca cccaggccgc cgccgtcctg ggccaccccg gccccgaggc
6501 cgtcggcacc cgcagcgtct tcaaggagct gggcttcgac tcgttggccg
6551 gcgtcgaact cgccgaccgg ctcaccgccc gcaccggact gcgcctgccg




6601 


gccaccctcg 


tcttcaactt 


ccccaccccc 


gaacgtgctg 


cccaccgcct 




6651 


cggggaactc 


ctcgccgcaa 


ccgcccccct 


cgaccccggg 


gcgtacggag 


30 


6701 


aggaactcac 


caggttcgag 


gcgatcgtga 


cgaacctgcc 


gcaggacggc 




6751 


cccgaacgcc 


gggccgtcgc 


ggaccggttg 


gacgccatcg 


tctccgcact 




6801 


ccgccagaac 


tcgcctgcag 


aggtgccctc 


ctcggacgag 


gacatcgaca 




6851 


cggtgtcggt 


cgacagactg 


ctcgacatca 


tcgatgaaga 


gttcgaaacc 




6901 


acatagagaa 


attgttgctt 


tcgttcgcga 


cccgatgacg 


aggacggacc 


35 


6951 


gatgcaggaa 


ccccagcaag 


gccagccgga 


ccagcaggag 


aaaatcgtcg 




7001 


actatctccg 


gcgggtcact 


tcagatcttc 


gccgtgcccg 


ccgccgcatt 




7051 


ggcgaactgg 


aatccaagga 


caacgagccc 


atcgccatcg 


tcggaatggg 
7101


ctgccgactt 


cccggcggcg 


tcaattcgcc


ggaatccctg 


tgggacctgg 




7151 


tgcgttccgg 


cggcgacgcc 


atttccggat


cccccgccga


ccgcggctgg 




7201 


gacctggaaa 


ccctcaccgg 


aaacggcgac 


ggcagcagcg 


ccacccacga 




7251 


aggcggattc 


ctctacgacg 


ccgcggaatt 


cgacgccgcc 


ccctccggca 




7301 


tctcgccgcg 


cgaggcgact 


gctatggacc 


cccagcagcg 


cctcctcctc




7351


gaagtcgcct 


gggaggcgct


ggagcgcgcc 


ggcatcgccc 


ccacagccct 




7401


gcgcggcagc 


cggtccggcg 


cguccgt-cgg 


CCCCUdCCaC 


tggggcgcgc 




7451


cctcggccga 


cgccgccacc 


gaactgcacg 


gccacgccct 


gaccggcacc 




7501


gccgccagcg 


tgctgtccgg 


ccgcctggcc


cacacccccg 


gcctcgaagg


10


7551 


cccggccgtc 


accgtcgaca 


ccgcctgctc


ctcctccctc 


gtcgccctgc 




7601


acctggcggc 


ccagtccctg 


cgcgtcggcg 


aatcctcgct 


cgccgtgatc 




7651 


ggcggcgtca 


cgatcctcac 


cgagccgtcc 


gtcctcgtcg 


agttcagcgc




7701 


ccagggcggc 


ctggcaccgg 


acggccgctg 


caaggcgttc 


tccgacgccg 




7751 


ccgacggcac 


cggttgggcc 


gagggcgtcg 


gtgtcctcgt 


cgccgagcgg 


15


7801 


ctctccgacg 


cgcagcgcaa 


cggccatccg 


gtcctcgccg 


tgctgcgcgg 




7851 


ctccgccgtc 


aaccaggacg 


gcgcctccaa 


cggcctgacc 


gcccccaacg 




7901 


gcccctccca 


ggaacgggtc 


atccagcagg 


ccctcgcccg 


gaccggcctg 




7951 


acccccgccg 


acatcgacgc 


cgtcgaggcg 


cacggcaccg 


gcacccggct 




8001 


cggcgacccc 


atcgaggccc 


aggccctgct 


cgccacctac 


ggccagggac 


20 


8051 


acacccccga 


ccagccgctg 


tggctcggct 


ccctgaagtc 


caacatcggg 




8101 


cacacccagg 


cggccgccgg 


cgtcgccggt 


gtcatcaaga 


tggtcatggc 




8151 


gctgcgccac 


ggccacctgc 


cgccgaccct 


gcacgccgac 


gcgccctcct 




8201 


cgcacgtgga 


ctggtccgcc 


ggatcggtac 


gcctgctgac 


cgagggccag 




8251 


cagtggccgg 


agaccggacg 


tccgcgccgg 


gccgcggtgt 


cctcgttcgg 


25


8301 


catcagcggc 


accaacgcgc 


acgccctgct 


ggaacaggca 


ccccaccccg 




8351 


cggacaccgc 


ggacgccggc 


gacgacgccg 


cgcccaccga 


accggccggc 




8401 


gcgcccgccg 


cgctgccctg 


gatcgtctcc 


ggacactccc 


cgcaggcgct 




8451 


gcgcgaccag 


gccgccgccc 


tggccgccag 


ggtcgagacc 


gaccccgcgc 




8501 


tccgccccca 


ggacatcggg 


cacaccctgc 


acaccgcccg 


cgccctgctc 


30 


8551 


gaacgacgcg 


ccgtcgtcgt 


cgcccccgac 


cgcgccgaac 


tcctcgcggc 




8601 


tacccacgag 


ttggccgccg 


gccggtccgc 


gaacgccgtc 


gtcgagggcc 




8651 


tcgcggacgt 


cgagggtcgg 


acggtgttcg 


tgttccccgg 


tcagggttcg 




8701 


cagtgggtgg 


ggatgggggc 


ccaactcctc 


gatgagtcgg 


cggtgttcgc 




8751 


ggagcggatt 


gccgagtgtg 


cggcggcact 


cgccgagttc 


accgactggt 


35 


8801 


cgctggtcga 


tgtgctgcgg 


ggtgtggtgg 


gtgcgccgtc 


gttggagcgg 




8851 


gtcgatgtgg 


tgcagccggc 


gtcgttcgcg 


gtgatggtgt 


cgttggctgc 




8901 


gttgtggggt 


tcccgtggtg 


tgttgccgga 


tgcggtggtg 


gggcattcgc 
8951


agggtgagat 


cgctgccgcg 




9001 


ggggcgcggg 


tggtggcgct 




9051 


ggggcggggc 


gggatgatgt 




9101 


cgcggttggt 


cgagttcgag 


5 


9151 


ccgcgctccg 


tcgtggtcgc 




9201 


cgcccggctg 


accgccgacg 




9251 


acgcctcgca 


ctcgcaccag 




9301 


gtgctggcgg 


agctggcgcc 




9351 


cgtgaccggc 


gactggctgg 


10 


9401 


tccgcaacct 


gcgcggacgg 




9451 


ctggcggcgg 


agtaccgcgc 




9501 


gacgatggcg 


gtcttggacc 




9551 


cgaccggcac 


cctgcgccgt 




9601 


tcggccgccg 


aggtcttcgt 


15 


9651 


gttcgagggg 


accggtgcgg 




9701 


agcgcgagcg 


gtactggaac 




9751 


gacgccccga 


tggacgccga 




9801 


ctccgcgctg 


accgccgcgc 




9851 


tcctgcccgg 


cctcacctcc 


20 


9901 


ctcgactcct 


ggcgctaccg 




9951 


ccgcgccacc 


ctgaccggca 




10001 


acgacaccga 


tgtggcaggg 




10051 


cggctggtcc 


tggacgagga 




10101 


gctggccggc 


gcggaggacg 


25 


10151 


ccgaggacga 


cgccgcacgc 




10201 


accgtctccc 


tcgtccaggc 




10251 


gtggttcctg 


acccgcggcg 




10301 


cccggcccct 


gcagagccag 




10351 


gagcacccgc 


agcgctgggg 


30 


10401 


cgcccgggcc 


gcccagcggc 




10451 


ccgaggacca 


gctcgccgtc 




10501 


gtgcgtgccg 


gacaccgcgc 




10551 


cggcaccacc 


ctgatcaccg 




10601 


cccgctggct 


ggccgaacgc 


35 


10651 


cgcggtgccg 


acgcccccgg 




10701 


gtcgggcacc 


gaggtgaccg 




10751 


cggtcgccgc 


gctgctggcc 



gtggtgtcgg gtgcgctgtc gttgcgggac 
gcggagtcag gccattggtc gtgcgttggc 
ccgtcgcgct gtcggtggac gtgctcgaac 
gggcgggtgt cggtggccgc cgtcaacggc
cggcgagccc gaggcgctgg acgcgctgca 
acatccgggc ccgccggatc gcggtggact 
gtcgaggacc tgcacgagga actgctggag 
gcgcacgtcg gaggtgccgt tcttctcgac 
acaccgcgcg gatggacgcc ggctactggt 
gtgcggttcg cggacgcggt ggcggacctg 
gttcgtcgag gtcagctcgc acccggtgct 
tgatcgagga ggccggggtc acggccgtcg 
gaccagggtg gcgcgggccg cttcctgctg 
gcgcggtgtg gacgtggact gggcgggggc 
cccgggtcga cctgcccacc tacgccttcc 
acccgcaccg ccgccgaccg caccccggcc 
attctgggcc gccgtcgaac aggcggacgt 
tcggcaccga cgaggactcc gtcgccgcca 
tggcgccggg cccgctccca gcgcaccacc 
cgtcacctgg acgcccctcg cccaggtgcc 
cctggctgct ggtcaccacc gacggcatcg 
gcgttggaga gctacggcgc cgaggtgcgc 
gtgcaccgac cgcgccgtcc tgcgggagcg 
tgaccggcat cgtctccgtc ctcgccgccg 
caccccggcc tcacccgggg actcgcgctc 
cctgggcgac gccgaggcga ccgcgccgct 
ccttcgccac cggcccgtcc gaccccgtca 
atcgcgggcg tcggctggac caccgcgctg 
cggcaccgtg gacctgcccg acaccctcga 
tcgccgccgc gctgtccggc gccctcggcg 
cgcgccgccg gggtactggc ccgccgcatc 
cggacgaccg gcacggacct gggcgccgcg 
gcggctccgg caccctcgcc ccgcagctcg 
ggcgccgagc acgtggtgct ggtcagccgg 
agcgcccgaa ctcatcgcgg aggcagccga 
tcgccgcctg cgacatcacc gaccgcgacg 
gacctcacgg ccgacggccg caccctgcgc 
10801


accgtcatcc 


acgccgccgc 


cgccatcgag 


ctgtccgcgc 


tcgccgacac 




10851 


caccgtggcg 


gagttcgccg 


acgtcgtgca 


cgccaaggtc 


accggcgcac 




10901 


ggatcctcga 


cgaactgctc 


gacgacgcgg 


aactggacga 


cttcgtcctg 




10951 


tactcctcca 


ccgccggcat 


gtggggcagc 


ggcgtgcacg 


ccgcctacgt 


5 


11001 


cgccggcaac 


gcccatctgt 


ccgcgctcgc 


cgagcagcgc 


cgcgcccgcg 




11051 


gactgcgcac 


cacctccatc 


cactggggca 


agtggcccga 


cgaccgggca 




11101 


cgcgagctgg 


ccgacccgca 


ccggatccgc 


cgcagcggtc 


tggagtacct 




11151 


cgaccccgag 


ctggcgctca 


ccgcgctcca 


gcacgtcctg 


gacgacgacg 




11201 


agaccgtcat 


cggcctcatg 


gacatcgact 


gggacaccta 


ccacgacgtg 


10 


11251 


ttcaccgcgg 


gccggcccgc 


gcacctcttc 


gaccagatcc 


ccgaggtgcg 




11301 


gcgccgcctc 


gaccaggcat 


ccgtcccgga 


ccccgcgggc 


ccggccgccg 




11351 


acggcctggc 


cgcccgcctg 


cacggcctcg 


ccgccgccga 


acaggaccgg 




11401 


ctgctgctca 


ccctggtccg 


caccgaggcc 


gccgccgtcc 


tcggccacgc 




11451 


ctcggccgag 


tccttccccg 


agcgccgcgc 


cttccgtgac 


ctcggcttcg 


15 


11501 


actcggtcac 


cgccgtggac 


ctgcgcaacc 


ggctcgtggc 


cggcaccgga 




11551 


ctgcggctgc 


cctcgacgat 


ggtcttcgac 


caccccaact 


gcgcggcgct 




11601 


cgccgcgttc 


ctgaagacga 


cggcgctcgg 


cgtccccggc 


gccgcaccgc 




11651 


agcagcacgc 


cgctaccggc 


accccggccg 


acgacgaccc 


gatcgccgtg 




11701 


atcggcatga 


gctgccgcta 


ccccggcggc 


gccgccaccc 


ccgaggaact 


20 


11751 


gctgcggctc 


gccctcgacg 


gcgccgacgt 


catctcggag 


ttccccgcgg 




11801 


accgcggctg 


ggacgcccgg 


ggcctgtacg 


acccggaccc 


cgaccgcccc 




11851 


ggccacacct 


actccgtcca 


gggcggcttc 


ctccacgagg 


ccgccggctt 




11901 


cgatcccggc 


ttcttcggga 


tctccccgcg 


cgaggcggtc 


gccatggacc 




11951 


cgcagcagcg 


gctcctgctg 


gagacctcct 


gggaggcgtt 


cgaacgcgcc 


25 


12001 


ggtatcgacc 


ccgcgtcact 


gcgcggcagc 


gccgccggca 


ccttcttcgg 




12051 


cgccagctac 


caggactact 


cctccaccgt 


gcagaacggc 


acgggggagt 




12101 


ccgaggcgca 


catggtgacc 


ggcaccgcgg 


ccagtgtcct 


gtccggccgg 




12151 


gtctcctacc 


tgctcggcct 


ggagggcccc 


gcggtcaccg 


tggacaccgc 




12201 


ctgctcctcc 


tcactggtcg 


ccctgcacct 


ggcctgccag 


tccctgcgcg 


30 


12251 


acggcgagag 


ctccctcgcg 


ctggccggcg 


gtgcggccgt 


gatggccacc 




12301 


ccgcacgcgt 


tcgtcggctt 


cagccggcag 


cgtgccctgg 


ccaaggacgg 




12351 


ccgctgcaag 


ccgttctccg 


acaccgccga 


cggcatgacg 


ctcgccgagg 




12401 


gcgtcggcgt 


cgtcctgctg 


gagcgcctgt 


cccacgcccg 


cgccaacggg 




12451 


caccgggtac 


tggccgtgat 


ccgcggttcc 


gccgtcaacc 


aggacggcgc 


35 


12501 


ctccaacggc 


ctgaccgcgc 


ccaacggccc 


gtcccagcag 


cgcgtcatcc 




12551 


gccaggcgct 


cgccaacgcc 


ggcctgaccg 


gcgccgacgt 


cgacgcggtc 




12601 


gaggcgcacg 


gcaccggcac 


caagctgggc 


gaccccatcg 


aggcccaggc 
12651


cctgctcgcc 


acctacggcc 


aggaccgcga 


cgccgaacgg 


ccgctgctgc 




12701 


tgggctcggt 


gaagtccaac 


atcggccaca 


cccaggccgc 


ggccggcgtc 




12751 


gccggcgtca 


tcaagatggt 


gctggccatg 


gacgccggcg 


aactgcccgg 




12801 


caccctgcac 


ctcgacgcgc 


cctccagcca 


cgtcgactgg 


accgccggcg 


5 


12851 


ccgtggaact 


gctgcgcggg 


cgcaccccgt 


ggcccgagag 


cgggcgcccc 




12901 


cgccgggccg 


gtgtctcctc 


gttcggcatc 


agcggcacca 


acgcccacct 




12951 


gatcctcgaa 


caggccccgg 


ccaccgagcc 


gccagccgac 


cccgaccgcc 




13001 


tccgggacac 


cgccaccgac 


accgtcgtcc 


cctggccgct 


cgccgccaag 




13051 


tccccggccg 


ccctgcgcgc 


ccaggccgcc 


cggctcctcg 


ccaccgtcga 


10 


13101 


gcacgacccc 


gacctcccgc 


ccgcccccgt 


gggccacgcc 


ctggccacca 




13151 


cccgcgccgc 


cctcgaacac 


cgcgccgtcg 


tcgtcggcga 


gcgccgcgag 




13201 


gacttcctgc 


gcggcctggc 


cgccctgtcc 


accggcgcct 


cgacggccgg 




13251 


cctggtcagc 


ggcatcgccg 


gccccgaccc 


cgagggagcg 


gtcttcgtct 




13301 


tccccggcca 


gggatcccag 


tggtggggaa 


tgggccgcga 


actcctcgcc 


15 


13351 


acgtccgagg 


tgttccgcac 


cgcgatcgat 


gactgcgcga 


cggccctcgc 




13401 


cccgtacgtc 


gactggtcgc 


tgcacgacgt 


cctggccggc 


gagggcgacc 




13451 


ccgccctgct 


ggagcgggtg 


gacgtggtcc 


agcccgcgct 


gttcgccatg 




13501 


atggtcgggc 


tgtccgcgct 


ctggcgctcc 


cacggcgtcg 


tcccggcggc 




13551 


cgtggtcggc 


cactcgcagg 


gcgagatcgc 


cgcggcctgc 


gtcgccggag 


20 


13601 


ccctcagcct 


ggccgacgcc 


gcccgcgtgg 


tggcgctgcg 


cagccaggca 




13651 


ctgccgcaac 


tgtccggacg 


cggcggcatg 


atgtcggtct 


ccgcccccgt 




13701 


agagcgggtc 


accgcactcc 


tcgccccgtg 


gcaggaggcg 


ctgtccgtcg 




13751 


ccgcggtcaa 


cggcccctcg 


tccgtggtcg 


tctccggcga 


caccgacgcg 




13801 


ctcgacgccc 


tgcacaccgc 


ctgccaggaa 


cagggcgtgc 


gggcccgcaa 


25 


13851 


ggtgtccgtg 


gactacgcct 


cgcacgggcg 


gcacgtcgag 


gccgtccgcg 




13901 


acgaactcgc 


ccgcgtcctc 


gcgccggtcg 


acccgcgcgc 


ccccgaggtg 




13951 


ccgttctact 


cgacggtcac 


cggcgaccgc 


gtggacgacg 


ccgccttcga 




14001 


cggcgcctac 


tggtacacca 


acctccgcca 


gaccgtccgc 


atggaggagg 




14051 


ccacccgcgc 


cctcctcgcc 


gccggacacc 


gcgtcttcat 


cgaggtcagc 


30 


14101 


ccgcacccgg 


tgctcgccgc 


cccgatccag 


gagacgcagg 


aggccgtagc 




14151 


ggaggccacc 


ggcgggtccg 


cggtggtcct 


cggctcgctc 


cgccgcgacg 




14201 


agggcggccc 


gcggcgcttc 


ctgacgtcgc 


tcgccgaggc 


ccacacccac 




14251 


ggcgccccgg 


tcgactggac 


caccaccttc 


gcccggtccg 


cctaccagcc 




14301 


ggtggacctg 


ccgacctacc 


ccttccaacg 


acaggacttc 


tggcccgagg 


35 


14351 


cccggcccgc 


caccccggcc 


gccggcgccg 


acgcgtccga 


cgccgcgttc 




14401 


tggcaactgg 


tcgagaacca 


ggacctcgcc 


gcgctcgccg 


acgcgctcgg 




14451 


cgtccccgcc 


gacgacgagc 


acaccgcgct 


cggcaccgtg 


ctgccggccc 
14501


tgtccgcctg 


gcgcgccaag 


gcccaggccc 


gcacccggat 


cgacgaactc 




14551 


cgctaccacg 


tccagtggac 


ccgggtcgcc 


gagcccgcgg 


cggcccccac 




14601 


caccggccgg 


ctgctggtcg 


ccgtcccgcc 


ggaccacgcc 


gacgccccct 




14651 


gggtcgccgc 


ggcgctcgac 


gccctgggca


ccgacaccgt 


ccgcttcgag


5 


14701 


gccaagggca 


ccgaccgcgc 


gggatgggcc


gcacagatcg 


cccaactcgt 




14751 


cgaggacggc 


gaggagttca 


ccggcgtggt 


gtcgctgctg 


gccgccgccg 




14801 


aggatctcca 


cccggacttc


ggctcggtac 


cgctggggct


ggggcagacc 




14851 


ctcgtcctcg 


tccaggccct 


cggcgacgcc 


ggcctgaccg 


cgcccctgtg 




14901 


gtgcctgacc 


cgcggcgccg 


tcgccaccgg 


ccgcgacgac 


gcccucgaca 


10


14951


gcccgaccca 


gggcgccctg 


tggggcctcg 


gccgggtcgt 


ggccctggaa 




15001 


caccccgacc 


gctggggcgg 


cctgatcgac 


ctgcccgcca 


ccctcgacgc 




15051 


ccgcgccgcg 


gcccgcctca 


ccggcctgct 


cgccgacccc 


gccggtgagg 




15101


accaactcgc 


cgtccgcgcc 


accggcgtgc 


tcgcccgccg 


catggtgcac 




15151 


gccgcgccgt 


ccgcgccccg 


caccgggcgc 


cgctggcggg 


gccgcggcac 


15 


15201 


ctgcctgatc 


accggcggca 


ccggcggcat 


cggcggccgg 


gtcgcccgct 




15251 


ggatggccga 


gcacggcgcc 


gcccacctgg 


tcctgaccag 


ccggcgcggc 




15301 


ccggacgcac 


ccggcgccgc 


cgcactccgg 


gccgaactgg 


aggccctggg 




15351 


cgcccgcgtc 


accctcgccg 


cctgcgacgt 


cgccgaccgc 


gacgccctgg 




15401 


ccgccctgct 


ggccgacctc 


cccgccgacc 


agccgctcac 


ctccgtcttc 


20 


15451 


cactccgccg 


gcgtggccga 


cggggacgcc 


cgggcagccg 


acctgaccct 




15501 


agatcagctc 


gacgcgctgc 


tgcgcgccaa 


actgaccgcc 


gcccaccacc 




15551 


tgcacgagct 


gaccgccccc 


ctcgacctcg 


acgcgttcgt 


gctcttctcc 




15601 


tccggcgccg 


cggtctgggg 


cagcggcggc 


cagcccggct 


acgccgccgc 




15651


caacgcctac 


ctcgacgccc 


tcgccgccca 


ccgcaggtcc 


ctcgacctgc 


25


15701 


ccggcgcgtc 


cgtcgcctgg 


ggcacctggg 


gcgaggtcgg 


catggccacc 




15751 


gtccccgagg 


tccacgagcg 


actgcaccgc 


caaggggtcc 


gcgccatgga 




15801 


accggaccac 


gcgatcggcg 


cgctccagca 


gatgctggag 


gacgacgaca 




15851 


ccaccctcgc 


cgtgaccctc 


atggactggg 


aggcgttcgc 


accgagcttc 




15901 


accgcgaccc 


gacccagcgc 


cctgttcagc 


acggtgcccg 


aagccgtccg 


30


15951 


cgcggtgacc 


ggcgacccgg 


gcaccacggc 


cggcgacgac 


gtggactccg 




16001 


cgaccccgcc 


gctccgccgc 


cacctggagg 


agctgtccgc 


cgccgagcgc 




16051 


ggccgggccc 


tggtcgaggc 


ggtccgcgcc 


gaggcgtccg 


cgaccctcgg 




16101 


ccacgacacc 


cccgacgcca 


tccccgccgg 


ccgtgccttc 


cgcgacgtcg 




16151 


gcttcgactc 


ggtcaccgcc 


gtcgaactgc 


gcaaccggct 


gcgcaccgcc 


35 


16201 


ctcggcctgc 


cgctgccggc 


cgcgctcgtc 


ttcgaccacc 


ccacccccac 




16251 


ggccctcgcc 


ggccacctcg 


gcgcgctgct 


cttcggcacc 


gcccccgagg 




16301 


acgccggcac 


cggccgcccc 


gacgaccccg 


acgcccgcat 


ccgcgaggcg 
16351


ctcgccaccg 


tccccatcgg 


acggctgcgc 


aaggcgggcc 


tcctcgacat 




16401 


ggtgctgaaa 


ctcgccgacg 


gagacgcgac 


cgacgccccc 


gcccccgagg 




16451 


ccgacgcccc 


ctcggaatcc 


ctcgacgaca 


tggacgccga 


agccctgctg 




16501 


cggctggcca 


ccgagaactc ggcgaactga 


aagagagctg 


gagcccacca 


5 


16551 


tgagcacgaa 


ccccgacaag 


tacgtcgagg 


cactccggtc 


gtccctgaag 




16601 


gagatcgagc 


ggctgcgccg 


gcagaacgaa 


cagctggtgg 


cggcggcggt 




16651 


cgagcccgtc 


gcggtcgtcg gcatcggctg 


ccgcttcccc 


ggcggcgtca 




16701 


cctcccccga 


ggacctgtgg 


gagctggtcg 


ccgaggggcg 


cgacgtcatc 




16751 


gggccgttcc 


cgcaggaccg 


cggctgggac 


ctggagaagt 


tggccggcgg 


10 


16801 


cggcgagggc ggcagcctcg cgcaggtcgg cggcttcgtc 


gaggacgccg 




16851 


ccggcttcga 


ccccggcttc 


ttcggcatct 


ccccgcgcga 


agcggtcgcc 




16901 


atggacccgc 


agcagcgcat 


cctgctggag 


atcacctggg 


aagccctgga 




16951 


acgcgccggc 


atcgacccgt 


ccaccctgcg 


cggtaccccc 


accggcgtct 




17001 


tcgtcggcac 


caccggccag 


gactacggcg 


aggtcatcaa 


ggcgtccgcc 


15 


17051 


gaggacgtcg 


aggtctactc 


gaccaccggc 


cacgccgcca 


gcgtcatctc 




17101 


cggccggctc 


tcctacaccc 


tcggcgccga 


gggcccggcc 


gtcaccgtcg 




17151 


acaccggctg 


ctcctcgtcc 


ctggtcgccc 


tgcactgggc 


cgtccaggcg 




17201 


ctgcgcggcg gcgagtgctc 


catggccctg 


gccggcggcg 


cgtccatcat 




17251 


ggccaccccg 


ggcccgttcg 


tcgccttcac 


cgcgcagagc 


ggcctggccg 


20 


17301 


ccgacggccg 


ctgcaagccc 


ttctccgacc 


gggccgacgg 


caccggctgg 




17351 


ggcgagggcg 


ccggcatgct 


ggtcctgatg 


cggctctccg 


acgcccagcg 




17401 


cgagggccgc 


ccggtcctcg 


ccgtgctgcg 


cggctccgcc 


atcaaccagg 




17451 


acggcgcctc 


caacggcctg 


accgctccca 


acggcccctc 


ccagcagcgc 




17501 


gtcatccgcg 


ccgcgctgga 


cagcgcccac 


ctcaccgccg 


ccgacatcga 


25 


17551 


cgccgtcgag 


gcccacggca 


ccggcaccac 


cctcggcgac 


ccgatcgagg 




17601 


cccaggcgct 


cctggcgacc 


tacggacagg 


accggccgcg 


gcccctgtgg 




17651 


ctcggctcgg 


tgaagtccaa 


catcggccac 


acccaggccg 


cctccggtgc 




17701 


cgccggcgtg 


atcaaaatga 


tcatggcgtt 


gcagcgcggc 


gtgctgccgc 




17751 


gcagcctgca 


cgccaccgaa 


cccaccacgg 


acgtcgactg 


gaccgccggc 


30 


17801 


tccgtcgacc 


tcctcgacga 


gacggtcgcc 


tggcccgaga 


ccggacgcgc 




17851 


ccgccgcgcc 


ggcgtctcct 


ccttcggcat 


cagcggcacc 


aacgcccacg 




17901 


tcatcctcga 


acaggccccc 


accgcccccg 


aagagcccac 


caccgaaccc 




17951 


accgtccgcc 


ccgccgtcgt 


cccgtgggcg 


ctctccgccc 


gcaccgccgc 




18001 


cgccctcgac 


gcccagcgcg 


cccgcctcac 


cggccacctc 


gccgacaccc 


35 


18051 


ccgacgccga 


ccccctcgac gtcggctacg 


cgctcgccga 


cggacgcgcc 




18101 


accttcgaac 


accgcgccgt 


cctgctcccc 


gacggcaccg 


aactcgccca 




18151 


cggaaccgcc 


ggcgaaggcc 


cctgcgccgt 


cctcttctcc 


ggccagggct 
18201


cccagcgccc 


gggcatggga 


cgcgaactcc 


acgcccgctt 


cccggtgttc 




18251 


gccgcggcct 


tcgacgagat 


cacagcgctc 


ctcgacaccc 


acctcgaccg 




18301 


cccgctgcgc 


gaggtcgtct 


ggggcaccga 


cgccgacctg 


ctgaacgaca 




18351 


ccggctgggc 


ccaacccgcc 


ctgttcgccg 


tcgaggtcgc 


cctctaccgc 


5 


18401 


ctggtcgcgt 


ccctcggcgt 


cacccccgac 


ttcgtcggcg 


gccactccat 




18451 


cggcgagctc 


gccgccgcgc 


acgtcgccgg 


ggtcctctcc 


ctcgaagacg 




18501 


cctgcaccct 


cgtcgccgcc 


cgcgcccgcc 


tcatgcaggc 


cctgccgcgc 




18551 


ggcggcgcga 


tgctcgcgat 


ccgcgccacc 


gaggacgagg 


tcacccccca 




18601 


cctcaccgac 


gacgtctcga 


tcgccgccgt 


caacgggccc 


acctccgtcg 


10 


18651 


tcgtcgccgg 


caccgaggaa 


gccgtcgccg 


cgatcggggc 


gcgcttcacc 




18701 


gcccaggacc 


gcaagaccac 


ccggctgcgg 


gtcagccacg 


ccttccactc 




18751 


gccgctcatg 


gacccgatgc 


tggcggaatt 


ccgcgccgtc 


gccgcgggcc 




18801 


tgacctacca 


cgagccgcgc 


atcccggtcc 


tctccaacct 


caccggcacc 




18851 


gtcgccgccg 


tcgccgacct 


gtgctccgcc 


gactactggg 


tccgccacgt 


15 


18901 


ccgcgaggcg 


gtccgcttcg 


ccgacggcgt 


caccgccctc 


accgaccgcg 




18951 


gcgtgaccac 


gctcgtcgaa 


ctcggcccgg 


acggcgtgct 


gtccgccatg 




19001 


gcccaggaat 


ccctgccgga 


cggcgccgcc 


gccgtgccgc 


tgctgcgcaa 




19051 


ggaccgcccc 


gaggagctct 


ccgccgtcac 


cggcctggcc 


cgcgcccacg 




19101 


tccgcggcgt 


cacggtccgc 


tgggccggcc 


tcttcgacgg 


caccggcgcg 


20 


19151 


cgccgcgccg 


acctgcccac 


ctaccccttc 


cagcaccagc 


ggttctggcc 




19201 


gaccgcggcc 


cgcgccgccc 


aggacgtcac 


cgccgcggga 


ctgggcgccg 




19251 


ccgaccaccc 


gctgctcggc 


gccaccgtcg 


aactcgccga 


cggggccggc 




19301 


tacttgttca 


ccagccggct 


ctccgtccgg 


acccacccct 


ggctcgccga 




19351 


ccacggggtc 


cagggccggg 


ccctgctgcc 


cggcaccgcc 


ttcgtcgaac 


25 


19401 


tggccgtccg 


cgccggcgac 


gaggccggct 


gcgaccgcgt 


cgaggaactg 




19451 


accctggccg 


cccccctggt 


gctgcccgag 


cgcggcggcg 


tccaactcca 




19501 


ggtccgcgtc 


ggcgcccccg 


acgccgccgg 


ccgccgcacc 


ctcggcatct 




19551 


tctcccgcgt 


cgaggacggc 


ttcgacctgc 


cctggtcgca 


acacgccacc 




19601 


ggcgtcctga 


ccgccggcgc 


cggcgccccc 


gaccccacct 


tcgacgccac 


30 


19651 


cgtctggccc 


cccagcggcg 


ccgaacccgt 


cgacctcacc 


ggcgcgtacg 




19701 


agcgcctggc 


cgcactcggc 


ttccagtacg 


gccccgcctt 


ccagggcctg 




19751 


cgcgccgcct 


ggcgccgcga 


caccgaggtc 


tacgccgaag 


tggccctgcc 




19801 


cgacggcgcg 


gacaccgacc 


ccgccgcctt 


cggactgcac 


ccggccctgc 




19851 


tggacgccgc 


acaacacgcc 


gccgcctacg 


ccgacctcgg 


cgccatcagc 


35 


19901 


cgcggcggcc 


tgccgttcgc 


ctgggaaggc 


gtctcgctcg 


ccgccgccgg 




19951 


cgccaccacc 


gtccgcgccc 


ggatcgcccc 


ggccggcgag 


gacaccgtca 




20001 


ccatcgccgt 


ctacgacgcc 


gccggcggca 


ccgtgctgtc 


cgtcgactcc 
20051


ctggtctccc 


gcgaggtccc 




20101 


ccaccgcgac 


tccctcttcc 




20151 


cgggccccgc 


accggccacc 




20201 


ctcgccgaca 


ccctccgcgc 


5 


20251 


cctggccgcc 


ctcgccgacg 




20301 


ccaccctcac 


caccaccccg 




20351 


accaccgccg 


ccgtcctcgc 




20401 


cttcgccgac 


gcccgcctgg 




20451 


ccgaccccgc 


cgccgcggcc 


10 


20501 


gagaaccccg 


gccgtttcgc 




20551 


gcccgacccc 


gagaccctgg 




20601 


ccgacctcgc 


cgtccgcggc 




20651 


gtcccgctcg 


ccaccgaacc 




20701 


gatcaccggc 


ggcaccggcg 


15 


20751 


tcgccaccca 


cggcgtccgc 




20801 


gccgccgacg 


gcgccgacga 




20851 


caccgtccac 


atcgccgcct 




20901 


acctgctcgg 


caccgtcccg 




20951 


accgccggcg 


tcgtcgacga 


20 


21001 


cctggacacc 


gtcctgcggc 




21051 


aggcgacccg 


ccacctcgac 




21101 


gccgccaccc 


tcggcagccc 




21151 


cttcctggac 


gccctcgccg 




21201 


cctccctcgc 


ctggggcccg 


25 


21251 


ctgtccgacc 


tcgacgtcga 




21301 


gaccctggaa 


cagggcaccg 




21351 


ccgccgccct 


cgccccggtc 




21401 


ggcgacatcg 


ccccgctgct 




21451 


caccgccgcc 


caggtctcgc 


30 


21501 


ccggcctcga 


cgccgccgcc 




21551 


acccagatcg 


cccaggtcct 




21601 


cggccgccag 


ttccaggacc 




21651 


tccgcaacgc 


cctgaacacc 




21701 


gtgttcgact 


acccgacacc 


35 


21751 


actcctgggc 


accgaggccg 




21801 


gtaccgccgg 


caccgacgac 




21851 


taccccggcg 


gcatcgcctc 



cgccgacgca cccggcgccg ccggcaccgt 
acgtcgagtg gaccccgctc cagggccgcc 
gtcgccgtcc tcggccccga cccggacgcc 
caccggcatc cggaccaccg ccccccgcga 
ccgaagggcc cgtccccgac ctggtcgtca 
ggcgcccccg tccccgacgc cgcgcacgcc 
cctcgcccaa cagtggctcg ccgacgaccg 
tcctcgtcac ccgcggcgcc accgacggca 
gccggcggcc tgatccgcac cgcccgcacc 
cctcctcgac ctcgcccccg acaccggccg 
ccaccgccct ggccgccagc cacgacgagc 
accgacgtgc acgccgcccg cctggcccgt 
caccacctgg aacccggacg gcaccgtcct 
gcctgggcgc ggtcctcgcc cgccacctgg 
cacctgctgc tcgccagccg ccgcggcccg 
cctgacggcc gaactcaccg ggctcggcgc 
gcgacgtcgc cgacccggcc gccctcgccg 
gccgggcacc cgctcaccgt cgtcgtccac 
cggcgtcctc ggctccctca ccccgcagcg 
ccaaggccga cgccgcctgg cacctgcacg 
ctggacgcct tcgtcccctt ctcgtccgtc 
cggacaggcc aactacgccg ccggcaacgc 
cccggcgcgc cgccaccggc ctgcccgcca 
tggacccaga gcgtcggcat gacaagcagc 
gcgcatcgcc cgctccggca tgcccccgct 
ccctcttcga cgcggccctg gccgccgggc 
cgcctcgacc tgcccgtcct gcgcacccag 
gcgcggcctg atccgcaccc ccgtgcggcg 
agaccgccga cggcctcgcc cagcggctcg 
cgccgggaag ccctcctgga actcgtccgc 
cggccacgcg gacgccaccg aggtggagac 
tcggcttcga ctccctcacc gccgtcgaac 
gccaccggcc tgcggctgcc cgccaccatg 
acacgccctc gccgaccacc tgcgcgacga 
agtcgaccac cgccgtcccc gtgccgaccc 
ccgaccgtca tcgtcggcat ggcctgccgc 
acccgaggac ctctggcgcc tggtcagcca 
21901 gggcgccgac gccactggcc

21951 acaacctcta cgaccccgac 

22001 gccggcggct tcctgcacga 

22051 gatgagcccg cgcgaggcga 

5 22101 tcgaactctc ctgggaagcc 

22151 ctgcgcgact ccggcaccgg 

22201 cggcaccacc ctgaccggcg 

22251 gcgccccgag cgtcgcctcc 

22301 ggcccggccg tcacggtgga 

10 22351 gcactgggcg gcgcaggcgt 

22401 ccggtggtgt gacggtgatg 

22451 cggcagcggg gtctggcgcc 

22501 cgcggacggc gtggcctggt 

22551 ggcagtcgga cgcggtgcgc 

15 22601 ggctcggcgg tcaaccagga 

22651 cggcccgtcc cagcagcggg 

22701 tgtccacggc cgacgtggac 

22751 ctcggtgacc cgatcgaggc 

22801 ccgcgacccc gagaacccgc
20 22851 gtcacaccca ggcagcggcc 

22901 gcgatgcggc acggcgtgct 

22951 ctcgcacgtc gattggagcg 

23001 ccgcctggcc ggagaccggc
23051 ggcatcagcg gcaccaacgc 

25 23101 cgtccccgcc acgcccgcgt 

23151 ccgtcccctg ggccctgtcc 

23201 gccgcccgcc tcctcgccca 

23251 cgacatcagc tactccctga 

23301 ccgtcgtcct cggcaccgac 

30 23351 ctcgccgccg gcgagaccga 

23401 cggccgcacc gccttcctct 

23451 tgggccgcgt cctctacgag 

23501 accgtcctca ccgccctcga 

23551 catctggggc gaggacgctc 

35 23601 ccgccctgtt cgccatcgag

23651 ggcatcacac cggacttcgt 

23701 cgcacacgtc gccggcgtgc 



cgttccccac caaccgcggc tgggacctgg 
cccgaccgcc cgggccgcac ccacgtccgc 
cgccggctcc ttcgacgccg acttcttcgg 
tggccaccga ctcccagcag cgcctgctgc 
gtcgaacgcg ccggcatcga ccccgcctca 
cgtcttcgcc ggcgtcatgt acaacgacta 
acgagtacga ggcgttccgc ggcaacggca 
ggccgcgtct cctacaccct cggcctggaa 
caccgcctgc tcttcctctc tggtcgccct 
tgcgggcggg ggagtgctcg ttggcgttgg 
tcgacgccga gcacgttcgt ggagttctcg 
tgatggtcgt tcgaaggcgt tcgccgaggc 
ccgagggcgt cggcatgctg gtcctggagc 
aacggtcacg agatcctggc cgtggtgcgc 
cggtgcgtcc aacggtctga ccgcgcccaa 
tgatccgtca ggcgttggcc agtggcggcc 
gccgttgagg cgcacggcac gggtacgacg 
ccaggcgctc ctggccacct acggtcgcga 
tgctgctcgg ctcgatcaag tccaacatcg 
ggtgtcgccg gtgtcatcaa gatggtcatg 
gccgcagacc ctgcatgtcg acgcgccgtc 
tcggcgccgt cgaactgctc accgagcaga 
cgggcccgtc gcgccggtgt ctcctccttc 
ccacgtcgtc atcgagcagt ccccgaccgc 
ccgccgaccg gtccgtcgag gaaccgccgg 
ggcaagaccc ccgacgccct ccgcgaccag 
cgtcgaggcc caccccgcac tgcgccccgt 
tcgccacccg caccgccttc gaccaccgcg 
cgcgccgagg ccctgcgcgc cctcaccgcc 
cccggccgcc ctcaccggca ccgtccgcac 
tctccggcca gggctcccaa cggctcggca 
cggttccccg ccttcgccga agccctcgac 
cgcggaactc ggccaccccc tccgcgacat 
aactcgtcga ccggaccggc tacacccaac 
gtggcactct tccgcctcct ggaagcctgg 
ggccggccac tccatcggcg agatcgccgc 
tctccctcgg cgacgcctgc cgcctcgtcg 
23751


tggcccgcgc 


cgtgctgatg 


cagtcgctgc 


ccgaaggcgg 


cgcgatgatc 




23801 


gccgtccagg 


ccaccgagga 


cgaggtcctg 


cccctcctca 


ccgacgacgt 




23851 


ctcgatcgcc 


gccgtcaaca 


gcccgacctc 


cgtcgtcgtc 


tccggctacg 




23901 


agaacgccac 


cctcgccgtc 


gcccggcact 


tcgccgacca 


gggccgccgc 


5 


23951 


accacgcggc 


tgcgcgtcag 


ccacgccttc 


cactcgccgc 


tgatggcgcc 




24001 


gatgctcgac 


gacttccgcg 


ccgtcgtcga 


gagcctcacc 


ttcaccgccc 




24051 


ccacgacccc 


cgtcgtctcc 


aacctgaccg 


gcgaactggc 


cccggccgag 




24101 


gcgctctgct 


cggccgacta 


ctgggtccgg 


cacgtccgcg 


aggcggtccg 




24151 


cttcgccgac 


ggcatccgca 


ccctcgccga 


ccgcggcgtc 


accaccttcg 


10 


24201 


tcgaactcgg 


ccccgacagc 


gtgctgtccg 


ccatggccca 


ggagtccgcc 




24251 cccgaaggcg ccggcaccat cccgctcctg cgccgcgacc ggcccgagga
24301 acaggccgtc ctggccgccc tctgccacct ccaggtgctc ggcgtcgagg
24351 ccgactggtc cgccaccttc cgcggcctcg accccgtccg cgtcgacctg
24401 ccgacctacg ccttccagca ccgctggttc tggcccgccg cccgacccgc
24451 ccgccccgac gacgtccgcg ccgccggcct gggcgccgcc gaacaccccc
24501 tcctcggcgc cgccgtgcaa ctccccgacg acgacggcgc actcttcacc
24551 ggccgcctct ccctgcgcac ccacccgtgg ctggccgacc acaccgtcct
24601 gggcaccgtc ctgctcccgg gcaccgcact ggtggaactc gccgtccgcg
24651 cgggcgacga gaccggcagc ggccacctcg aagaactcac cctcgccgcg
24701 cccctgaccc tccccgagga cggcgccacc ctcctccagg tccgcgtcgg
24751 atccgccgac gacaccggcc gccgcaccgt caccgtccac gcccgccccg
24801 acgacaccgc cgaccgcacc tggacgctgc acgccaccgg tgtgctcgcc
24851 accacgccac cggccgccgc ggcgttcgac accacggtct ggccgcccgc
24901 cgacgccgaa cccctcacca ccgacgactg ctacgcacac ttcaccaccc
24951 accgcttcgc ctacggcccc gccttccagg gcctgcgggc cgcctggcgc
25001 gccggcgacg tgctgtacgc cgaggtcgcc ctgccggagt ccgccaccga
25051 cgaagcggcc gccttcggcc tgcacccggc gctcctggac gccggcctgc
25101 acgccgcgct cctcgccgac gaccgcgaca ccggactccc gttctcctgg
25151 gaaggcgtca ctctgcacgc ctccggcgcc accgcgctac gcgtccggct
25201 cgccccgaac ggccccaacg gcctgtccgt caccgccgcc gacccggccg
25251 gcaaccccgt cgccaccgtc acccgcctgc tcgcccgccc cctggacgcc
25301 gagcagttga ccatccacag cgccctgacc cgcgacgcgc tcttccacct
25351 ggactggacc ccggtcccgc ttcccgacac cgccaactcc gcgccgccgg
25401 ccctcctcgg cccggacacc gccgtgctcg ccgacgccct cggcgacccg
25451 gccgtcgcac gccacgcaac cctcgacgac ctcctggccg gggacaccac
25501 cccgcccgcc acggtcctcg tccccctcgg cgccccactc gacggcgaca
25551 ccgcgcagca cgcgcacgcc ctcacccgca gcgcgctgac cctcgtccag
25601 cagtggctcg ccaccgaccg cctcgccgac tcccgcctgg tcttcgtcac
25651 ccacggagcc gtcgccaccg acgacgcgcc ccccaccgac ctggccgccg
25701 ccgcggtctg gggcctgatc cgctccgcgc agaccgagaa ccccggcacc
25751 ttcaccctcc tcgacctcga caccgagccc gactcgacca ccgcgctcag
25801 ccgcgccctg accctcgacg aaccacagct cctcctccgc gccggccgcg
25851 cccgcgccgc ccgcctcacc cgcacccccg cccccaccac caccacccac
25901 acgccgtggt ccgcggacgg aacggtgttg gtgacgggtg gtacgggtgg
25951 tctgggtggg ttggtggccc ggcatctggt gcggtcgtgt ggggtgcggc
26001 atttgttgtt gaccagtcgt tctggtgtgg gtgctgcggg tgcggccggg
26051 ttggtcgcgg agttggagtc gttgggcgcg cgggttgtgg ttgcggcgtg
26101 tgatgtgggt gatggctcgg ctgttgcgga gttggttgcc ggtgtgtcgg
26151 agtcgtatcc gttgtctgcg gtggtgcatg cggctggtgt gttggatgac
26201 ggtgtggtgg gttcgttgac gccggagcgg ttggctgcgg tgttgcgtcc
26251 gaaggtggat ggtgcgtgga acctgcatga ggcgacgcgt ggtctggatc
26301 tggacgcgtt tgttgtcttc tcgtctgttg cgggtgtgtt cgggggtgcg
26351 ggtcaggcca actatgcggc gggtaatgcg tttttggacg cgttgatggt
26401 tcatcgggtg gctggtgggt tgcctggtgt gtcgttggcg tggggtgctt
26451 gggatcaggg tgtggggatg acggcggggc tgacggagcg ggatgtccgt
26501 cgtgctgctg agtcgggtat gccgttgttg acggttgatc agggtgtggc
26551 gttgttcgat gcggcgttgg cgacggggag tgccgcgttg gtgccggtcc
26601 gtctggacct ggccgcactg cgcacccggg gcgacatcgc accgctcctc
26651 cgcggcctcg tccgcgcacc gctgcgccgc accgcggcca ccggcctcgc
26701 caccggcgcg gacaccggcc tcgtccaacg gctcggccga ctcgaccacg
26751 cccaacgcca cgaggcactg ctcgacatgg tccgcagcag cgccgcgctc
26801 gtcctcggcc acgccgacgg caacgccatc gacgccgaac gcgccttccg
26851 cgacctcggc ttcgactcgc tcaccgcggt cgaactccgc aaccgtctgc
26901 gcaccgccac cggcctgcac ctgtcggcca ccatggtctt cgaccacccc
26951 accctgtccg ccctcgcgga gcacctgcgg gacgagttgt tcggcgcggt
27001 cgagagcgag gtgcgggtgc cggtccaggc actgccgccg accgccgacg
27051 atcccatcgt ggtggtgggc atggcctgcc gtttccccgg tggtgtgacc
27101 tcgcccgagg acctgtggcg cctggtcgac gacggcaccg acgccatcac
27151 caccttcccg accaaccgcg gctgggacct ggacaacctc tacgacccgg
27201 accccgagca cttcggcacg tcgtacaccc gctccggtgg cttcctgcac
27251 gaggcggggg agttcgaccc ggcgttcttc ggaatgagcc cgcgtgaggc
27301 gctggcaacc gactcccaac agcgtctcct gctggaatcc tcctgggagg
27351 cgatcgagcg ggccggcatc gacccgctga ccctgcgcgg cagcgccacc
27401 ggcgtcttcg ccggcgtgat gtacagcgac tacgggagca tcctcggcgg
27451 caaggagttc gagggcttcc aaggccaggg aagtgcgggc agcgtggcct
27501 cgggccgcgt ctcctacgcc ctcggcttcg agggcccggc cgtcacggtg
27551 gacacggctt gctcttcctc tctggtcgcc ctgcactggg cggcgcaggc
27601 gttgcgggcg ggggagtgct cgttggcgtt ggccggtggt gtgacggtga
27651 tgtcgacgcc gagcacgttc gtggagttct cgcggcagcg gggtctggcg
27701 cctgatggtc gttccaaggc gttcgccgag gccgcggacg gcgtcggctg
27751 gtccgagggc gtcggcatcc tcgtcctgga gcgccagtcg gacgcggtgc
27801 gcaacggcca cgagatcctc gccgtgatcc gcggctcggc ggtcaaccag
27851 gacggtgcgt ccaacggcct gaccgcgccc aacggcccgt cccagcagcg
27901 cgtcatccgt caggcgttgg ccagtggcgg cctgtccacg gccgacgtgg
27951 acgccgtcga ggcgcacggc acgggtacga cgctcggtga cccgatcgag
28001 gcccaggcgc tcctggccac ctacggccgt gaccgcgacc ccgagaaccc
28051 cctgtggctg ggctccctga agtccaacat cgggcacacc caggcagcgg
28101 ccggtgtcgc cggtgtcatc aagatggtca tggcgatgcg gcacggcgtg
28151 ctgccgcaga ccctgcatgt cgacgcgccg tcctcgcacg tcgattggag
28201 cgtcggcgcc gtcgaactgc tcaccgagca gaccgcctgg ccggagaccg
28251 gccgggtccg tcgcgccggt gtctcctcct tcggcatcag cggcaccaac
28301 gcccacgtca tcgtggaaca gccggcgctc gtcgaaagcc cggccgcgga
28351 gccgagcgga cgcgaacccg gcgtcgttcc gctgccgctg tccggaaagt
28401 cccccgaggc cctgcgcgac caggccgcac gcctgctggc cgggttggcg
28451 gagcggcccg cgctgcgccc gctcgacctc ggctactcgc tggcgacgac
28501 ccgttcggcg ttcgaccacc gggcggtggt gctcgccacc gaccgcgccg
28551 atgcggtccg cgcgctgacg gcgctcgccg ccgccgacgc ggatctctcc
28601 gccgtcgtcg gcgacacccg cacgggtcgt cacgcggtgc tgttctcggg
28651 tcagggctcg caacgcctgg gcatggggcg tgagttgtac gagcgtttcc
28701 cggtcttcgc cgaggctctc gatgtcgcga tcgaccacct ggacgccgcc
28751 ttgcccgccc aggccagtct gcgtgaggtg atgtggggcg acgatgtcga
28801 gctgctggac gagacgggtt ggacgcagcc ggctctgttc gccgtcgagg
28851 tggccctgtt ccggctggtg gagagttggg gtgtccgtcc ggacttcgtg
28901 gccggtcatt ccatcggtga gatcgcggcg gcgcatgtcg tcggggtgtt
28951 ctcgctggag gacgcctgcc gtctggtggc cgcccgtgcg acgctgatgc
29001 aggcgctgcc gaccggcggc gcgatgatcg cgatccaggc cgccgaggac
29051 gaagtcaccc agcacctgac tgacgacgtc tcgatcgccg ccgtcaacgg
29101 cccgacctcc gtggtcgtct ccggggccga gagcgctgcc cgcacggtgg
29151 ccgaccggct cgcggagaac ggccgcaaga cgacccggct gcgggtctcg
29201 catgcgttcc actcgccgtt gatggatccg atgctggcgg agttccgtgc
29251 ggtggccgag ggcctgtcct acgccacccc gaccctcccc gtcgtctcga
29301 acctgacggg ccggctggcc acggccgatg acctctgctc ggccgagtac
29351 tgggcgcgcc acgtccgcga ggcggtccgc ttcgccgacg gcgtcagcac
29401 cctggagaac gagggcgtca ccacgttcct ggaactggga ccggacggcg
29451 tgctgtccgc catggcccag cagtcgctca ccggcgacgc cgccaccgtc
29501 ccggccctcc gcaaggaccg cgacgaggag acgtccgcgc tcaccgccct
29551 cgcccacctc cacacggcag gtctccgcgt cgactgggcg gcgttcttcg
29601 ccggcagcgg cgccacccgc gtggacctgc cgacctacgc cttccagcac
29651 gccacctact ggcccaccgg caccctgccc accgcgcacg ccgcggccgt
29701 cggcctcacc gccgccgagc acccgctgct gaacggttcg gtcgaactcg
29751 ccgaaggcga aggggtgttg ttcaccggac ggctgtcact gcagtcacat
29801 ccgtggctgg ccgaccacgc cgtcatggga caggtcctgc tgcccggcac
29851 cgcactgctg gaactggcgt tccgggccgg cgacgaggcc ggttgcgacc
29901 gcgtcgagga actgacgctc gccgcaccgc tcgtcctgcc cgagcgcggt
29951 gcggtacaga cccaggtccg ggtcggcgtc gccgacgaca ccggccgccg
30001 taccgtcacc gtccactccc ggcccgagca cgcgaccgac gtgtcgtgga
30051 cccagcacgc gaccggcacc ctgaccatgg gctccgcccc ggccgacacc
30101 ggtttcgacg ccactgcctg gccgcccgcc gacgccgaac ccctcgccac
30151 cgacgactgc tacgcgcgct tcacgacgct cggcttcgcc tacgggccgg
30201 tcttccaggg cctgcgggcc gcctggcgcg ccggtgacgt gctgtacgcc
30251 gaggtggccc tggcggagtc caccggcgac gaggcgaccg ccttcggtct
30301 gcaccccgca ctgctcgacg ccgccctgca cgcctccctc gtcgcccacg
30351 agggcgagga gagcaacggc ggactgccgt tctcctggga gggcgcgacc
30401 ctctacgcga ccggcgccac cgcgctgcgc gtccggctga ccccgacggg
30451 caccgacggc cgttcggtgg ccatcgccgt ggccgacacc gccggtcgtc
30501 cggtcgccgc catcgacaac ctcgtctcgc gccgggtctc cggcgaccag
30551 ttgaccggcg ccgcgggact ggcccgcgac gccctgttca ccctggactg
30601 gaaccccgta ccggagaacc tcgtaccgga gaaccctgta ccggagaaca
30651 ccggcggggg ccacgcccag gaccaggacg gccggcccgc cgcggccacc
30701 gtcgcgctgg tcggcgcgga cggcaccgcg atcgccgccg acctgaccgc
30751 cgccggcatc cacaccaccc tccaccccga cctcaccacc ctcgccacga
30801 ccgacgccga cgttccgaag acggtcctca tccccctcac cggaaccgga
30851 accggaaccg gcaccgggac tgagtcgacg gacggaatcg ggacgggggc
30901 cgccgagtcg gacgcgtccg ccccctcccc ggccgaggtc gcccacaccc
30951 tgtccaccgc cgcactcgcc ctcgtccagg agtggaccgc acaggagcgc
31001 ttcgccggct cccgcctggc gttcgtcacg accggggcga cggccgccgg
31051 cggtaccgac gtcatggacg tggccgccgc cgcggtctgg ggcctggtcc
31101 gatccgccca gtccgaagcc ccggacacct tcgtcctgat cgaccgtgac
31151 cccggcccgg ccggcacgca cgaccgcaca gccgccgccg aacggggcca
31201 actgctccta cgggcactgc acaccgacga accgcagctc gccctgcgtg
31251 acggcggcgt gctcgccgcc cgcctggccc gcttcgacac cgcggccgcg
31301 ctcaccccgc cggccgaccg ggcctggcgg ctcgacagca cggccaaggg
31351 cagcctcaac ggcctcgccc tgaccccgta tccggcggca ctggcgccgc
31401 tcaccggcca cgaggtgcgg gtcgaggtgc gtgccgcggg cctgaacttc
31451 cgtgacgtgc tcaacgcgtt ggggatgtat ccgggtgatg atgtcggatc
31501 gttcggttcg gaggcggccg gtgtggtcgt cgaggtcgga ccggaggtga
31551 ccggcctggc ccccggcgac caggtcatgg gcatgatcac cggcagcttc
31601 ggctcgctcg ccgtggacga cgcgcggcgc ctcgcccgcc tgcccgagga
31651 ctggtcctgg gagacgggtg cgtcggtgcc gttggtgttc ctcaccgcgt
31701 actacgccct gaaggagttg ggtggtctgc gggcggggga gaaggtgctg
31751 gtgcatgccg gtgccggtgg tgtcggtatg gcggcgatcc agatcgcccg
31801 gcatgtcggt gccgaggtgt tcgccacggc cagtgagggc aagtgggacg
31851 tgttgcgctc gctcggcgtg gccgacgacc acatcgcctc ctcccgcacc
31901 ctcgacttcg aggcggcctt cgccgaagtc gccggcgacc gcggcctgga
31951 cgtcgtcctc aactccctcg ccggtgactt cgtcgacgcc tcgatgcggt
32001 tgctcggcga cggcggccgg ttcctggaga tgggcaagac cgacatccgc
32051 gccgcggact ccgttcccga cggcctctcc taccagtcct tcgacctcgc
32101 ctgggtggtg ccggaaacca tcggcaccat gctggccgag ctgatggacc
32151 tcttccgcac cggcgcactg cggccactcc cggtccgcac ctgggacgtc
32201 cggcacgcca aggacgcgtt ccgcttcatg agcatggcca agcacatcgg
32251 caagatcgtg ctcaccctgc cccgctcctg gaagcccgag ggaacggtgt
32301 tggtgacggg tggtacgggt ggtctgggtg ggttggtggc ccggcatctg
32351 gtgcggtcgt gtggggtgcg gcatttgttg ttgaccagtc gttctggtgt
32401 gggtgctgcg ggtgcggccg ggttggtcgc ggagttggag tcgttgggcg
32451 cgcgggttgt ggttgcggcg tgtgatgtgg gtgatggctc ggctgttgcg
32501 gagttggttg ccggtgtgtc ggagtcgtat ccgttgtctg cggtggtgca
32551 tgcggctggt gtgttggatg acggtgtggt gggttcgttg acgccggagc
32601 ggttggctgc ggtgttgcgt ccgaaggtgg atggtgcgtg gaacctgcat
32651 gaggcgacgc gtggtctgga tctggacgcg tttgttgtct tctcgtctgt
32701 tgcgggtgtg ttcgggggtg cgggtcaggc caactatgcg gcgggtaatg
32751 cgtttttgga cgcgttgatg gttcatcggg tggctggtgg gttgcctggt
32801 gtgtcgttgg cgtggggtgc ttgggatcag ggtgtgggga tgacggcggg
32851 gctgacggag cgggatgtcc gtcgtgctgc tgagtcgggt atgccgttgt
32901 tgacggttga tcagggtgtg gcgttgttcg atgcggcgtt ggcgacgggg
32951 agtgccgcgt tggtgccggt ccgtctggac ctggccgcac tgcgcacccg
33001 gggcgacatc gcaccgctcc tccgcggcct ggtcaaggcg cccatccgcc
33051 gcgcagccgc caccacaccc ggcgacaccg gactcgccga gcagctcacc
33101 cggctccagc gcgccgagcg acgggacacc ctcctcgcgc tcgtccgcga
33151 ccaggccgcg atggtcctcg gccacacctc gggcgacggc gtcgacccgt
33201 cccgcgcctt ccgcgacctc ggcttcgact cgctcaccgc ggtcgaactc
33251 cgcaaccgca tcggcgcggc caccggcctg cggctaccgg ccacggccgt
33301 cttcgactac cccaccgccg atgccctcgc cgcacacctg ctcaccgaac
33351 tgctcggccc cgacgccgag tcggaccccg acgagcccgg cgaccccacc
33401 gcgggaccga ccgacgaccc catcgtcatc atcggcatga gctgccgctt
33451 ccccggcgac atcggctcgc cggaggacct gtggcgcctg ctcggcgacg
33501 gcgccgacgt cgtcaccgac ttcccgacca accgcggctg ggacctggac
33551 aacctctacg accccgaccc cgcgcacgcc ggcacctcgt acgcccgcac
33601 cggcggtttc ctgcacgacg ccgccgactt cgacgccgac ttcttcggca
33651 tgagcccccg cgaggccatg gccacggact cccagcagcg cctgctgttg
33701 gagtcctcgt gggaggcgat cgagcgggcc ggcatcgacc cgctgaccct
33751 gcgcgacagc cgcaccggcg tcttcgccgg cgtcatgtac agcggctacg
33801 gcacccgcct cgacggcgcc gaattcgaag gcttccaggg gcagggcagc
33851 gcactgagcg tggcctccgg ccgggtctcc tacaccttcg gcttcgaagg
33901 cccggccatg acggtcgaca ccgcctgctc ctcctcgctg gtcgccctgc
33951 acctcgccgc acaggcactc cgcggcggtg agtgcaccct cgccctcgcc
34001 ggtggtgtca ccgtgatgtc catcccggac accttcatcg agttctcccg
34051 gcagcgcgga ctggcccccg acggccgctc caagccgttc tccgagtccg
34101 ccgacggcgt cggctggtcc gagggcgtcg gaatgctgct cctggagcgc
34151 cagtcggacg ccgtgcgcaa cggccaccag atcctggccg tggtgcgcgg
34201 ctcggcggtc aaccaggacg gtgcgtccaa cggcctgacc gcgcccaacg
34251 gcccgtccca gcagcgggtg atccgtcagg cgttggccag cggcggcctg
34301 tccacggccg acgtggacgc cgtcgaggcg cacggcacgg gcaccacgct
34351 cggtgacccg atcgaggccc aggccctcct ggccacctac ggccgcgacc
34401 gcgaccccga gaacccgctg ctgctcggtt cgatcaagtc caacctcggc
34451 cacacccagg cagccgccgg tgtcgccggc gtcatcaaga tggtcatggc
34501 gatgcggcac ggcgtgctgc cccgcagcct gaacatcacc gagccgtcct
34551 cgcacgtcga ttggagcgcc ggcgccgtcg aactgctcac cgagcagacc
34601 gcctggccgg agaccggccg ggcccgtcgc gccggtatct cctccttcgg
34651 catcagcggc accaacgccc acgtcatcct ggagcagccg gaggccgcgc
34701 ggcactcggc gccggaagaa gccgacacgg cggaggcagc cgccaaggcg
34751 ccggccaccg cgcacctgcc cgtaatgccg tgggcactgt ccggcaagac
34801 gccggaggcc ctgcgtgccc aggccgcacg cctcctcgcc cacctccagc
34851 agcgccccga actcgcaccc gccgacatcg ccctgtccct cgccacccag
34901 cgctcccagt tcacccaccg ggcagtcgtc ctgagcaccg accgtgacga
34951 ggcgacccgc gcgctgtccg ccctcgccac caccgccgcg tccgacccct
35001 cggccctcac cggcacggtc accatgggac gttgcgcggt gctgttctcg
35051 ggtcagggct cgcaacgtct gggcatgggg cgtgagttgt acgagcgttt
35101 cccggtcttc gccgaggctc tcgatgtcgt gatcgatcac ctggacgccg
35151 ccttgcccgc ccaggccggt ttgcgtgagg tgatgtgggg cgacgatgtc
35201 gagttgctga acgagacggg ttggacccag cccgcgctct tcgccatcga
35251 ggtggcgctg tttcggctgg tggagagttg gggtgtccgt ccggacttcg
35301 tggccggtca ttccatcggt gagatcgcgg cggcgcatgt cgtcggggtg
35351 ttctcgttgg aggacgcgtg ccgtctggtg gccgcgcggg cgacgctgat
35401 gcaggcgttg ccggccggtg gcgcgatgat cgcggtccag gcgaccgagg
35451 acgaagtcat cccgcacctg accgacgagg tggcgatcgc ggccgtcaac
35501 ggcccgacct ccgtggtgat ctcgggcgca gaagaggcca cgcagaccgt
35551 ggcacaacac ttcgccgacc aggggcgccg gacgaccgcg ctgcgggtct
35601 cgcatgcgtt ccactcgccg ctgatgatgc tggcggagtt ccgtgcggtg
35651 gccgagggcc tgtcctacgc caccccgacc ctccccgtcg tctcgaatct
35701 gacgggccag gtggccacgg ccgacgaact ctgctcggcc gagtactggg
35751 tgcgccacgt ccgtgaggcg gtccgcttcg ccgacggtgt gacggccctc
35801 gaagccgagg gcgtgcggac cttcctggaa ctcggcccgg acggcgtcct
35851 cgccgccatg gccagggaaa ccgtcgccga cgacacggtc accgtccccg
35901 tcctccgcag gaacatgccc gaggaacgga ccctgctcac cgcactcggc
35951 cggctccaca ccaccggaac cccgatcgac tgggccgccc tcctggcccc
36001 gaccggcgcc cgcccggtgg acctgccgac atacgcgttc caacaccgtc
36051 ccttctggcc ctccggcccc cgcgacaccg cggatgccgc cgccgtcggc
36101 atcgccggcg cgagccaccc gctcctcaac ggcatcgtcg aactcgccga
36151 cgaagagggc ctgttgttca ccggacggct gtcactgcag tcgcatccgt
36201 ggctggccga ccacgccgtc atgggacagg tcctgctgcc cggcaccgca
36251 ctgctggaac tcgccctgcg cgccggcgac gaggtcggct gtgaccacgt
36301 cgaggaactg acgctcgccg caccgctcgt cctgcccgag cgcggcgcgg
36351 tacagaccca ggtccgggtc ggcgtcgccg acaccaccgg gcgccgcacc
36401 gtcacgatcc actcgcgtcc cgcacgcgcc acgaccaccg acagtgacac
36451 ccacaccggc accgacaccc cgtggaccca acacgccacc ggcgtcctcg
36501 tcgccggcct gccggcgacg gcaaccgtcc cgttcgatgc caccgtgtgg
36551 ccgccggcgc acgccgaacc cgttgacctg gcggacttct acgcgtcccg
36601 ggccggcgaa ggattcggct acgggcccgc tttccagggc ctgcgagcgg
36651 cctggcgccg cgacggcgag gtgttcgccg atgtcgcact gccggaggcc
36701 ggccgtaccg aagccgaggc gtacgggctg catccggcac tgctcgacgc
36751 cggactgcac gcagcctggc tcgtcgcccc ggacggggag cccacacgga
36801 cgggcagcgt gccgttctcg tggcgcggcg tcttcctggc cgcttccggt
36851 gcctcctcgg tccgcgtccg actcggccgc gactccgacg gaacgctgag
36901 cctggccatc gccgacacca ccggtgcacc ggtcgcgtcc gtacaggccc
36951 tctccatgcg caccgtctcg gtgacggccc tgagcgccac cgcgggcctc
37001 gcccgcgacg cgctgttccg cctggactgg gcctcggccc cggagccggc
37051 gtgccagccg gacgacacgg tgaccgtgat cccggcggtc gcggtcgtcg
37101 gtacggaaac ctccgaactc acctccgagc tcaccgcggc cctgcgtgcg
37151 gcgggcgccg acgtcgacgt ccgcacgacc ctgtcgaccg atgaacccgc
37201 gcccgcgctg atcgcgctgc cgctcgtcgc ctccgaccag accggcaccg
37251 cagaagcggc acccgtcccg gcggccgtgc acgacctcac ccgccgagcc
37301 ctggccctcg tacagacccg cctgcaagag cagcacttcg cggacacgaa
37351 gttcgtcttc gtgacccgtg gtgcgacggt cgggcgtgat gtggctgctg
37401 ctgcggtgtg gggtctggtg cgttcggcgc agtcggagaa tccgggttgt
37451 tttgctctgg tcgatctgga tccggatggt gcggtgggtg cggctgcgct
37501 cgtcgctgcg ttggtcagtg gtgagccgca gcttgcggtg cgcggtgatg
37551 tgttgcgggt cgcgcgtctg gtgcggcggc cgctcaccga ggtcggtgcg
37601 ggtgctgatg gcaccgggga tggcgtcggg ggtggctctg gtgtgtcgtt
37651 ctcgggtgag ggtgcggtcc tggtcactgg tggtacgggt ggtctgggtg
37701 cggtgttggc gcgtcatctg gtggccgagt atggggtgcg ggatctgctg
37751 ttggtcagtc gcagtggtga acgtgccgtg ggtgctgggg agttggtggc
37801 ggagcttgcg ggtgtgggtg cgcgggtgcg ggtggttgcg tgtgatgtga
37851 ccgatcgtgc cgcggtggtg gagttggttg gcgggcatgc ggtgtccgcg
37901 gtggttcatg cggctggtgt gctggatgac ggcatggtgg gtgcgttgac
37951 cggggagcgg ttgtccgcgg tgctgcggcc gaaggtggat gctgtctggc
38001 atctgcatga ggcgacccgc ggcctggatc tggacgcgtt cgtcgtcttc
38051 tcctctctcg ccggggtctt cggcagtccc ggccaggcca actacgcggc
38101 cgcgaacgcc ttcctggacg cgctgatgac gcggcgccgg gcggagggac
38151 tgcccggcct gtcactcgca tggggaccgt ggtcgctgac cgatggcaca
38201 tcgggcatgc tcgcggacgc cgaggccgat cgcctgaccc gttcgggagt
38251 gccaccgctg accgcggagc aaggactggc actgttcgac gcggccctgg
38301 cgaccggtga cgccacctgc gtcccggtcc gcctggacct gtcggcactg
38351 cgtgcccagg gtgaggtgcc gcccttgctg cggtccctga tccgaggccg
38401 ctcgcgccgc gccgccgccg cggaatcggc aaccgccacc gggctgcggg
38451 aacggctcgt cggactgaac ccggtcgagc gacaggaagt cctcctggac
38501 ctcgtgcgcg gccaggtggc cctggtcctc ggccacgccg acgccgacga
38551 cgttcatccg gcacgtgcct tcagggagtt gggcttcgac tccctcacct
38601 cggtggaact gcgcaaccgc ctcaacaccg tcaccggtct gcggctcccc
38651 gcgaccatgg tgttcgacta tccgaccgtc gaagtgctgg tctcctacgt
38701 cctggacgag ttgctgggca cggatgccga ggtggcgacc gtgcagccgg
38751 ccgcggttgc ggtggcggac gatccgatcg tcatcgtggg catggcctgc
38801 cgctaccccg gcggggtcgc ctctccggac gacctgtgga ggctcgtcac
38851 ggacggcgtc gacgcggtgt cccccttccc gaccaaccgt ggttgggacg
38901 tcgaatccct ctatcacccg gaccccgacc atctcggtac ctcctacacg
38951 cgctcgggtg ggttcctgca tgaggcgggg gagttcgatc cggggttctt
39001 cgggatgagt ccgcgggagg cgttggcgac cgattcccag cagcggttgt
39051 tgttggagtc gtcgtgggag gcgatcgagc gggccggtat tgatccggtg
39101 agtttgcggg gtagtcggac gggtgtgttc gcgggggtga tgtacagcga
39151 ttacagcgcg atgttggcga gtccggagtt cgagggtttc cagggcagtg
39201 ggagttcgcc gagtttggcg tcgggtcggg tggcctacac gttggggttg
39251 gaaggcccgg cggtgacggt ggatacggcg tgttcgtcgt cgttggtggc
39301 gatgcactgg gcgatgcagg cgttgcgtag tggtgagtgt gggttggcgt
39351 tggccggtgg tgtgacggtg atgtcgacgc ctgcggtgtt tgtggacttt
39401 gctcggcagc ggggtttgtc gccggatggt cggtgcaagg cgtttgcgga
39451 tgcggccgat ggtgtgggct ggtccgaggg cgtcggcgtg ttggtcctgg
39501 agcggcagtc ggacgcggtg cgcaatggtc acgagatttt ggctgtggtg
39551 cggggttcgg cggtcaacca ggatggtgcg tccaatggtt tgacggcgcc
39601 caatggtccg tcgcagcagc gggtgatccg gcaggcgttg gccagtggtg
39651 gtctgacggc cggtgacgtg gatgtggtgg aggcgcatgg tacgggtacg
39701 acgctcggtg atccgatcga ggcgcaggcg ttgttggcga cgtatgggcg
39751 ggatcgtgag cctgagcggc cgttgttgtt gggttcggtg aagtcgaatc
39801 tggggcatac gcaggctgct gcgggtgtgg cgggcgttat caagatggtg
39851 ttggcgatgc ggcatggtgt ggtgccgcgg acgttgcatg tggatgcgcc
39901 ttcttcgcat gtggactggt ccgagggtgc ggtggagctg ctcagtgagc
39951 aggcggcttg gccggagacg ggtcgggtgc ggcgggcggg tgtctcctcc
40001 ttcggcatca gcggcaccaa cgcgcacgtg attctggagc gtccggaggc
40051 cgcgcggcgt ccggtgatgg agacgaacac cgtggagccg tccacggtgc
40101 cgtgggtgct ttccggcaag acaccggagg ccctgcgcgc ccaggcggcg
40151 aaactgttgt cgtcgatcga ggaacgcccg gagcttcgcc tcgtcgacgt
40201 gggaatgtcc ctggtcaccg gccgctcgac ctttgaacac cgtgcggtgg
40251 tactcgccgc cgaccgtgcc gacgccgccc gcgcattgtc ggcgattgct
40301 gccgacgagg ccgatgctgc cgctgccacc ggccgcgtcg gcgcgggtcg
40351 tcacgcggtg ctgttctcgg gtcagggtgc tcaacgtctg ggcatggggc
40401 gtgagttgta cgagcgtttc ccggtcttcg ccgaggctct cgatgtcgtg
40451 gtcgaccacc tggacgccgc cttgcccgcc caggccggtt tgcgtgaggt
40501 gatgtggggc gacgatgcgg agttgctgaa cgagacgggt tggacgcagc
40551 cggctctgtt cgccatcgag gtggcgctgt tccggctggt ggagagttgg
40601 ggtgtccgtc cggacttcgt ggccggtcat tccatcggtg agatcgcggc
40651 ggcgcatgtc gccggggtgt tctcgctgga ggacgcctgc cgcctggtgg
40701 ccgcccgtgc gacgctgatg caggcgttgc cggccggcgg tgcgatgatc
40751 gcggtccagg cgaccgagga cgaagtcacc ccgcatctga ccgacgacgt
40801 agcgatcgcc gccatcaacg ggccgaacgc actggtcgta tcgggtgtgg
40851 aggatgccgc cgtcgagatc ggggcgcggt tcgcggccga ggggcgtcgc
40901 acgacccgac tccatgtgtc gcatgcgttc cactcgccgt tgatggatcc
40951 gatgctggcg gagttccgtg tggtggcgga gggcctgtcc tatgctgctc
41001 cgtccctccc cgtcgtctcg aatctgacgg gccaggtggc cacggccgac
41051 gaactgtgct cggccgagta ctgggtgcgc cacgtccgcg aggcggtccg
41101 cttcgccgac ggggtgacgg ccctcgaagc cgagggcgtg cggaccttcc
41151 tggaactcgg cccggacggc gtcctcgccg ccatggcagg agcctcgctc
41201 accgaatcct ccctcgcggt accgctgctc cgtaaggacc ggccggagga
41251 accggcggca ctcgccgccc tggcccagtt gcacatcgcc ggcgcgcgcg
41301 tcgattggcc cgtgctcttc gctggtgtgg gtgcggggcg ggtggagttg
41351 ccgacgtatg cgttccagcg tgggtggttc tggccggttg gtcgggttgg
41401 tgttggtggt gatgtgggtg ctgtggggct tgggtctgcg gggcatccgt
41451 tgttgggtgc tgcggtggag ttggctgcgg gtgcgggggt ggtgttgacg
41501 ggtcgtctgt cgttgtcgtc gcatggttgg ttggctgatc atgcggtgat
41551 ggggcgggtg tttgttcctg gtacggcgtt gctggagatg gtgatgcgtg
41601 ctggtgatga ggtggggtgt ggtcgtgttg aggagctgac gttggcggcg
41651 ccgttggtgt tgcctgagcg tggtggggtg cgggttcagg ttgctgtgga
41701 tgctcctgat gctgcgggtc gtcgtggtgt gggggtgtat tcgtgtcctg
41751 atggtgtggg tcaggcggtg tggtcgcagc atgctgtcgg tgtgttggcc
41801 tctggtgtgg ctgaccaggt cggtgggttc ggtgacggtg gtgtgtggcc
41851 gccgcagggt gcggtgtcgg tggatgctga gggctgctac gagctgtttg
41901 cggatgctgg gttcggttat ggcccggtgt tccaggggtt gcgtgcggtg
41951 tggcgtcgcg gcgaggaact cttcgcagag gtcgccctgt cggacgaggt
42001 tgctgagagc gctgatacgg cgaccggttt cgggttgcac ccggcgttgc
42051 tggatgcctc gctccatgcc tcgcttctct cctcccttga aggtcaatcc
42101 gccgatggtg ggcctgcgtt gccgttcgcg tgggagggtg tttccctctt
42151 cgcctcgggt gcgacggctt tgcgcgtgcg gttggcgccg gcgggtgagc
42201 atgcggtgtc ggtgaccgcg gtggatccga ccggtgcgcc ggtgatttcc
42251 atcgacgcgc ttcgtacccg tcgcctcacc ctcgatgagg tcaacgcatc
42301 tcacacccag ctgagcgatg cgctcttcgg cgtccaatgg accacggtcc
42351 cgagcacccc ggccgccgac cacccgtcgg tagccatcat cggaaccgac
42401 cacctggggc tcgccgaagc gctcagcagt tcctctgctg gcgcgaccac


5 


42451 


gaccacgacc 


gcggccgcgt 


acgagagcct 


ugacgcgccy 


atcycgy<_yg 




42501 


ggcccgaagt 


gtccgtccct 


gacgbcacac 




cz\ c c ci ^ n 




42551 


gacgccatcg 


ctcagtacgt 


gaacgaccac 


gacgccacgg 


t" firtf *** 




42601 


aggcaccatc 


ggagccggcg 


cggcggccgt 


agacgcggct


cy t-t-y UL-cl 




42651 


ccgccgaggc 


cctgcgcacg 


atccaggcat 


yy L.y y L-L-y d 


y y y ^* 


10 


42701 


gcggcaaggc 


gcctggtcct 


cgtgacccgc 


ggcgcggccg 


ct^yyy^ctyyct 




42751 


tgtcgccgcg 


gcggccgtcc 


agggcctggt 


gcgctccgcc


-5 /-r a ^ O .£3 
L-ciycn— yaya 




42801 


acccgggcac 


cttcggcctc 


ctcgacctcg 


acgggaccga


yy cy LuyctLw 




42851 


gcggtcctcg 


gcgaggctct 


cacctccgac 


gaaccgcaac 


^ « ♦* 4- ^ ^* 

ugctLCtycg 




42901 cgacggacac ctgcacgccg ccaggctgac ccgcctggcg tcgcccgccg
42951 acacggccgt gcccacggag tggaacgcgg acggcacggt gctgatcacc
43001 ggtggtaccg gcgggctcgg cgcgcagttc gcacggcacc tcgtcgacag
43051 gtacggcgtc cgcaatctcc tgctcgtcag ccggcgcggc cccgatgccc
43101 cgggaaccac ggagttggtc gccgagctga cggcgcacgg tgccgaggtg
43151 gccgtgcagg catgtgacgt ggccgatggc gatgcggtgg cggcgttggt
43201 cgccggcgtg ccggatgagc acccgctgag ggcggtcgtc cacacggccg
43251 gtgtgttgga cgacggagtg atcggctcgc tcaccgagga gcggctcgcc
43301 accgtcctgc ggcccaaggc ggatgccgcc tggcatctgc acgaagcgac
43351 ccgcggcctg gacctggacg cgttcgtcgt cttctcctcc gtcgcagggg
43401 tcttcggcgg cgccggccag gccaactacg ccgcggccaa cgccttcctg
43451 gatgcgctga tggcccagcg ccgggcagcg ggcctgcccg gactcccgtt
43501 ggcctggggg ccgtgggacc agaccggcgg aatgacgggc atgctgtcgg
43551 acgccgaggc cgaccgcctc gcccgctccg gcatcccgcc gctctccgcg
43601 gagcagggcc tcgccctctt cgacgcggca ctcgctcttg ccggaaccag
43651 cacgccagac agggcagccg gcagcgccgc cgccagcacg tcgggaaccg
43701 gcgacacgat cgccatcccg gccgcggccc tcgtcgcacc ggtccggctc
43751 gacctggcgg cgctggccgc gcagggtgag gtccccgcga ttctgcgcgg
43801 actggtgcgg acccgcaccc gccgtacggc ggccggcggt tccgtcaccg
43851 tggccggact cgtcaaccgc ctgtccgggc tcaccgccga cgagcggcgc
43901 caggaacttc tcgaactggt ccgcactcag gcagccctcg tcctcgggca
43951 cgccgatccg gcgtccgtgg actccaccgc acagttccgt gacctcggct
44001 tcgactcgct gaccgccgtc gagctgcgca accggctgag cacggccacc
44051 ggcctgcgcc tgaccgcaac cctggtcttc gactacccga acaccgatgc
44101 cctcgcggag cacctgcggg acgagttgtt cggcgcggtc gagagcgagg
44151 tgcgggtgcc ggtccaggca ctgccgccga ccgccgacga tcccatcgtg
44201 gtggtgggca tggcctgccg tttccccggt ggtgtgacct cgcccgagga
44251 cctgtggcgc ctggtcgacg ccggcaccga cgccatcacc accttcccga
44301 ccaaccgcgg ctgggacctc gaatcgctct acgacccgga ccccgcacac
44351 ctcggcacct cctacacccg ctccggtggc ttcctgcacg aggcggggga
44401 gttcgacccg gcgttcttcg gaatgagccc gcgtgaggcg ctggcaaccg
44451 actcccaaca gcgtctcctg ctggaatcct cctgggaggc gatcgagcgg
44501 gccggcatcg acccgctgac cctgcgcggc agcgccaccg gcgtcttcgc
44551 cggcgtgatg tacagcgact acgggagcat cctcggcggc aaggagttcg
44601 agggcttcca aggccaggga agtgcgggca gcgtggcctc gggccgcgtc
44651 tcctacaccc tcggcttcga aggtcccgcc gtcaccgtgg ataccgcctg
44701 ctcctcctcc ctggtcgccc tgcacctggc agcccaggcc cttcgggcgg
44751 gtgagtgcac gctcgcgctc gccggtggtg tgacggtgat gtccacgcca
44801 ggcacgttcg tggagttctc gcggcagcgg ggtctggcgc ctgatggtcg
44851 ttccaaggcg ttcgccgagg ccgcggacgg cgtcggctgg tccgagggcg
44901 tcggcatcct cgtcctggag cgccagtcgg acgccgtgcg caacggccac
44951 gagatcctcg ccgtgatccg cggctcggcg gtcaaccagg acggtgcgtc
45001 caacggcctg accgcgccca acggcccgtc ccagcagcgc gtcatccgtc
45051 aggcgttggc cagtggcggc ctgtccacgg ccgacgtgga cgccgtcgag
45101 gcgcacggca cgggtacgac gctcggtgac ccgatcgagg cccaggcgct
45151 cctggccacc tacggccgcg accgcgaccc cgagaacccg ctgctgctcg
45201 gttcgatcaa gtccaacctc ggccacaccc aggcagcggc cggtgtcgcc
45251 ggtgtcatca agatggtcat ggcgatgcgg cacggcgtgc tgccgcagac
45301 cctgcatgtc gacgcgccgt cctcgcacgt cgattggagc gtcggcgccg
45351 tcgaactgct caccgagcag accgtctggc cggagaccgg ccgggtccgt
45401 cgcgccggtg tctcctcctt cggcatcagc ggcaccaacg cccacgtcat
45451 cctggaacag ccggaggccg tgcagcgcct ggcaccggga gcagcagaga
45501 ccgtggagcc ggtcgccatc aagccgtcgg cggaaccgtc cctggtgccg
45551 tgggcgctgt ccggcaagtc acccgaggcc ctgcgcgccc aggccgcacg
45601 cctccgtgac ttcctggcgg aacggcccga accgcgctcg atcgacatcg
45651 gccactcact ggccgtcaca cgctcgcagt tcgaccaccg cgcgatcgtg
45701 ctggtcgacg atgcgaaggc cccggcggac agcctggccg ccctcgcggc
45751 cctggcctcc ggtgtggccg atcccgccgt cgtctccgac gcggtatcga
45801 ccggcggttc ggcagtgctg ttcacaggtc agggtgctca acgtctgggc
45851 atggggcgtg agttgtacgg ccgtttcccg gtcttcgccg aggctctcga
45901 tgtcgtggtc gaccacctgg acgccgcctt gcccgcccag gccggtttgc
45951


gtgaggtgat 


gtggggcgac 




46001 


acccagcccg 


cgctcttcgc 




46051 


gcgttggggt 


gtccgtccgg 




46101 


tcgcggcggc 


gcatgtcgcc 


5 


46151 


ctggtggccg 


cccgtgcgac 




46201 


gatgatcgcg 


gtccaggcca 




46251 


acgaggtggc 


gatcgcggcc 




46301 


ggcgcagaag 


aggccacgca 




46351 


gcgccggacg 


accgcgctgc 


0 


46401 - 


tggatccgat 


gctggcggag 




46451 


gccaccccgt 


ccctccccgt 




46501 


ggccgacgaa 


ctgtgctcgg 




46551 


cggtccgctt 


cgccgacggc 




46601 


accttcctgg 


aactcggccc 


5 


46651 


gtccctcgcc 


ggcgaagccg 




46701 


gtgaggagtc 


cacggccctg 




46751 


ctgatcgaag 


actggcagga 




46801 


ggagttgccg 


acgtatgcgt 




46851 


gggttggtgt 


tggtggtgat 


0 


46901 


catccgttgt 


tgggtgctgc 




46951 


gttgacgggt 


cgtctgtcgt 




47001 


cggtgatggg 


gcgggtgttt 




47051 


atgcgtgctg 


gtgatgaggt 




47101 


ggcggcgccg 


ttggtgttgc 


5 


47151 


ctgtggatgc 


tcctgatgct 




47201 


tgtcctgatg 


gtgtgggtca 




47251 


gttggcctct 


ggtgcggctg 




47301 


tgtggccgcc 


gcagggtgcg 




47351 


ctgtttgcgg 


atgctgggtt 


0 


47401 


tgcggtgtgg 


cgtcgcggcg 




47451 


acgaggttgc 


tgagagcgct 




47501 


gcgttgctgg 


atgcctcgct 




47551 


tcaatccgcc 


gatggtgggc 




47601 


ccctcttcgc 


ctcgggtgcg 


5 


47651 


ggtgagcatg 


cggtgtcggt 




47701 


gatttccatc 


gacgcgcttc 




47751 


acgcatccca 


cacccagctg 



gatgtcgagt tgctgaacga gacgggttgg 
cgtcgaggtg gcgctgttcc ggctggtgga 
acttcgtggc cggtcattcc atcggtgaga 
ggggtgttct cgctggagga cgcctgccgt 
gctgatgcag gcgctgccga ccggcggcgc 
ccgaggacga agtgaccccg cacctgaccg 
gtcaacggcc cgacctccgt ggtgatctcg 
gaccgtggca caacacttcg ccgaccaggg 
gggtctcgca tgcgttccac tcgccgctga 
ttccgtgcgg tggcggaagg actgtcctac 
cgtctcgaat ctgacgggct ggctggccac 
ccgagtactg ggtgcgccac gtccgcgagg 
atcaccaccc tcgaagccga gggcgtgcgg 
ggacggcatc ctgtccgcgc tggctcagca 
tcaccgtgcc cgtcctgcgc aaggaccgcg 
acggcccgag cgcatctcca cacccgcgga 
cttcttcgct ggtgtgggtg cggggcgggt 
tccagcgtgg gtggttctgg ccggttggtc 
gtgggtgctg tggggcttgg gtctgcgggg 
ggtggagttg gctgcgggtg cgggggtggt 
tgtcgtcgca tggttggttg gctgatcatg 
gctcctggta cggcgttgct ggagatggtg 
ggggtgtggt cgtgttgagg aactgacgtt 
ctgagcgtgg tggggtgcgg gttcaggttg 
gcgggtcgtc gtggtgtggg ggtgtattcg 
ggcggtgtgg tcgcagcatg ctgtcggtgt 
accaggtcgg tgggttcggt gacggtggtg 
gtgtcggtgg atgctgaggg ctgctacgag 
cggttatggc ccggtgttcc aggggttgcg 
aggaactctt cgcagaggtc gccctgtcgg 
gatacggcga ccggtttcgg gttgcacccg 
ccatgcctcg cttctctcct cccttgaagg 
ctgcgttgcc gttcgcgtgg gagggtgttt 
acggctttgc gcgtgcggtt ggcgccggcg 
gaccgcggtg gatccgaccg gtgcgccggt 
gtacccgtcg cctcaccctc gatgaggtca 
agcgatgcgc tcttcggcgt ccaatggacc 
47801 acggtcccga gcaccccggc cgccgaccac ccgtcggtag ccatcatcgg
47851 aaccgacccc ttcggcctcg cagacggcct ttcggacgcc ttgcccctgg
47901 tcgaggagcg cggtgacctc gcggcgctcg cagcgtcgga gcacccggta
47951 ccggacctcg tcctcgtccc ggtagcgggc acccggcgca caggcgtacc
48001 tgcggacgcc gaaggacaca ccgacgccgg gacatccgac atgctccgat
48051 ccgtgcgtga ggccaccgca caggtactgg agcagatcca gcagtggttg
48101 gcggacgacc ggttcgaggc ggcgcggctg gtgttcgtga cgcgcggggc
48151 ggtttccgtg ggtgagggcg gcatcgccga cctggcggcc tcggccgtct
48201 ggggtctggt gcggtcggcg cagtcggaga atccgggctg cttcggtctt
48251 ctcgacctcg acctcgacct cgcccttgac tccgaccttg cccccgaggt
48301 cgacatcgag cgcgaccgtg accgcgatcc ggtcggtggg accgtgcagc
48351 ccgcgctcgc cgcggccctg cacgcgaccg ccgacgagcc gcagttggca
48401 ctgcgcggcg ggaccgtgca ggccgcccga ctgacccgaa tccccgcgcc
48451 gcagaccgac cgtgccgaga ccgaccctgc cgagaccgac cgtccggaga
48501 tcgacacccg gcggcccggc acggtgctca tcaccggtgg taccggtggc
48551 ctcggtgggt tgctcgcccg gcacctcgtc gccgagcggg gggtacggag
48601 cctggtgctc gccagccgga gcggtctcgc ggccgaggga gcggagaagc
48651 tggtcgccga cctcgaagcg ctcggtgccg tggtggccgt gcagacgtgt
48701 gatgtggccg atggcgatgc ggtggcggcg ttggtcgccg gcgtgtcgga
48751 cgagtacccg ctgacggcgg tcgtccacac ggccggtgtg ttggacgacg
48801 gagtgatcgg ctcgctcacc gaggagcggc tcgccaccgt cctgcggccc
48851 aaggcggatg ccgcctggca tctgcacgag gcgacccgcg atctggacct
48901 ggacgcgttc gtcgtcttct cctccctcgc cggcgtcctc ggtggcgccg
48951 gtcaggccaa ctacgcggcg gcgaacacgt tcctggacgc cttgatggcg
49001 cagcgtcgcg ccgccgggtt gccgggtgtg tcgctggcgt ggggtccgtg
49051 ggaccgggcc ggcggcatga cggggaccct gtcggacgcc gaggccgacc
49101 gcctcgcccg ctccggtgtt ccgccgatct cggcggagca gggccttgcg
49151 ctgtacgacg cggcgaccgc cggtgagcgg ccgctggtgg tgccggtgcg
49201 gctggacctc gccgcgctcc gcgggctcgg tgatgtcccg gcgctgctgc
49251 gcggactggt ccggacgccc gcgcggcgga ccgcggcggc cggtgcggcg
49301 ccgtcggccg atgtgctcac ccggcagttg gccgggctcg gcggggcgga
49351 gcaggaggag gtcctgctga ggctggtgcg cggtcaggcc gcggtggtgc
49401 tcgggcacgc cgacggctcg gcgatcggtg cggggcgaca gttccaggag
49451 ttgggcttcg actcgctgac cgcggtggag ttccgcaacc gactcaacgc
49501 ggccaccgga ctgcggctgc cggccaccct gctgttcgac tacccgacgc
49551 cggccgacgt cgtcgggcac ctgcgcggcc ggctcggcac cggggaggtg
49601 tcgggtgcgg gctcggtgct ggcggcgctg gacaaccttg aggcggtgat
49651 cgccgggctg tccctcgacg acgcggggga gcaccagttg gtggccggcc
49701 ggctggaggt cctcagggcg aagtgggcgg acatgcgaag cgcggaggga
49751 gctgtggacg gcggtgcgga cgtcgacatc gaggaggcgt cggacgacga
49801 catgttcgcg ctgctggacg acgagctggg gctgaactga gccgctccgc
49851 atgagcagtt ccgcaccaga agttccacag tggcctggtc cacagtgacc
49901 tggtccacag tgagctgttc cacagcgacc tgttccgcgg cggttccgta
49951 gtgagccgtt ccgcttaggc gggaaaccgg ggccccggtg gtcggacgct
50001 tgattccggc caccgggagc gtcaaaccgg cctcttctaa agaaagggaa
50051 agaactaggg aaacacccac agcccgatca tgtgcaatga attccgtggc
50101 tcggaaaacc attcgcgggc catggagtta cggtgtgatc acgtgtccgt
50151 acggcagtga cggcgtgcac gggttgccac ggcaggtccc cggccgaggc
50201 gtctccccgc tgctgcccgg cggtgcgtgc cgcttggttc gtccctagga
50251 ggtcccccca tgaccacgtc caccgaggag agcctgtggg cccggtgctt
50301 ccatccagca ccggccgccc ccgtccggct cttctgtttc ccacatgcgg
50351 gcggctcggc ctccttctac ttcccggtgt cggcccaact gtcctcggtt
50401 gccgaggtgt tcgccatcca gtacccgggg cgccaggacc ggcgcaagga
50451 agccggtgtc agtgacctcg cgaccttggc cgaccaggtc tacgacgcgc
50501 tgcgccccct gctgaaggag cggccgagca cgttcttcgg gcacagcatg
50551 ggcgcgacgc tggccttcga ggtggcccgg cgcttcgagg ccgacgacgg
50601 tgacctggtc cggctgttcg cctccgggcg ccgggccccc tcccgcgtgc
50651 gtgaagaggc cgtgcaccgg cggtccgacg acggcatcgt cgaggagctg
50701 aagctgctcg ccggcaccaa caccgcgctg ctcggcgacg aggagatcct
50751 gcggatgatc ctgcccgcga tccgcagcga ctaccaggcc atcgagacct
50801 accgctgccc gcccgacgtc accgtccggg cgccgctgac cgtcctcacc
50851 ggcgaccgcg acccgaagac ctccctggac gaggccgagg cgtggcgcgg
50901 ccacaccacc ggggacttcg acctcaaggt gcttcccggt gggcacttct
50951 tcgtcagctc cgaggccccg gcgatcatcg atctgctccg ggcgcacctc
51001 gccggcaacg gctagcgggc gcactgcggc aggccggcgg tgccggtctg
51051 ccgcacggcc cccgcgccgc ctgagacggc accatgccac ccggcgacgc
51101 gcgttgcgtg cgcctgtgga gcaccgtccc gcgtgcgtac gcgggcgcgc
51151 ggaggacgtg ctcggccggc cggatgtcct ggaggagacc attcggaaaa
51201 cgtcaaatgt ctgacagctg ctgctcattt actccagccc gcactatatg
51251 atcttgtggt ggggcatgta cggagcgtgc cactcgttcg ctcctggtgc
51301 atccatcccg accgtccgga tttgtgtgca ccggacggcc gtggtcctgg
51351 tacgcgtcgg cgcgaggtcg gctcctgttc tcaccccacg tggaggaacg
51401 cgtgatgcgg aagcagtcgg gttcttccgg cttactgacc acgctggtgg
51451 gacgcgacga cgaactgcgc accctcgccc ggcacgccgc ggccgcccgc








51501 gacgggcggg ccggcctggt cctgctccac gggcccgccg gcatgggcaa
51551 gaccagtctg ctgcggtcct tcacggcgag cgatgtctgc cgcggcatga
51601 cggtgctgta cggcacctgc ggcgagaccg tcgccggcgc cgggtacggc
51651 ggtgtgcgcg aactcctcgg cgggctcggc ctgagcggcg gcgacgcccg
51701 gcgctcgccc ctcctggagg gcctggcggc ccgcgcgctg cccgcactca
51751 ccgcagaccc cgccggtccc gacgccgcca cgggtgccta cccggtgctg
51801 cacggcctgt actggctcgc cgcccgcctc atggcccaac ggccgctggt
51851 cctcgtcctc gacgacgtcc actggtgcga cgaacgctcc ctggcctgga
51901 tcgacttcct gctgcgacgc gccgaggacc tgccgctgct ggtcgtcctg
51951 gcctggcgca gcgaggccga accggtcgcg cccgcggtgc tcgccgacat
52001 cgccgcccag cgccgcccca ccgtgctcgg cctgcacccg ctcggccccg
52051 acgacatagg cgaaatggtg cgtcgcgtct tccggaccac ggccgcaccg
52101 tcgttcgtca gccgggtcgc cgccgtgtcc ggcggcaatc cgctggccct
52151 cgcccgcctc ctcgacgaac tccgcgccga gggcgtccgg ccggacgccg
52201 ccggggagcg ccgggccgcc gaggtcggca gtcacgtcct cgcccgctcg
52251 gtgcgctgcc tgttggagcg ccggccgccc tgggtgcgcg gcgtggcccg
52301 tgccatcgcc gtactcggcc cggagtgcac cgagttgctg gcggcgctcg
52351 ccggcgtccc ggccgcgacc gtcgacgagg ccctgttggt gctgcgcagg
52401 gccggcatcc tggccgccga ccgcgtggac ttcgtccatg acgtcgtccg
52451 ctccgcggtg ctcgacgacg tcgccccgcc caccctggcc gaactgcgca
52501 ccaacgccgc gctgttgctg agcgacgccg gccgcccctc cgaggagctc
52551 gccggccagc tcatgctgct gccggtgctc gaccagccgt ggatggccgc
52601 ggtgctgcgc gacgccgccg cccaggcgga gagccgcggc gccccggagg
52651 ccggtgtgcg ctgcctctac cgggtgttgg aggtggagcc ggacaacgtt
52701 gccgtccgca tccagatggc ccgcgcgctc gccgagatca acccgcccga
52751 ggcgatgcgc ctcctcaagg aagcgctctc cctcgccggg gacgtccgca
52801 cccgcgccca ggtcgccgtc cagtacggct tcacctgcct cgccgtgcag
52851 gaatccccgt ccggggtgcg gatgctggag gacgcgctcg ccgagctgac
52901 cgccgaactg ggccccgaac cagggcccgt ggaccgggag ttgcggaccc
52951 tcgtggagtc cgtgctgctc atcgtcgggg ccgacgagaa ggtgacgatc
53001 ggtgcggtcc gcgaccgggc ggcccggctc accatgccgc ccggcgacac
53051 accggcccag cggcagatgc tggccatgac caccgtgctg accgcgatgg
53101 acggccggga cgcccggtcg gccgtcgacc aggcccgccg cgccctgcgc
53151 gcacccggcg tcgagctgga accctggtcg ctgctgtccg cctccttcgc
53201 cctctccctg gccgacgagg tcgccgacgc gcagtacgca ctggacctca
53251 tgctccagta cggccaggac aacgcggcgg tgtggacgta cgtcctggcg
53301 ctctccacgc gcgccctgct ccaccacggg gtgggcgcct tccccgaggc
53351 cctggccgac gcccagaccg ccgtcgagat cctcggcgag gagcgctggg
53401 cggacggcgc cgtgctgccc cgtgtcgcgc tggccaccgc cctcgtcgac
53451 cgcggcgagc ccgagcgcgc cgaacacgtc ctcgacggca tcacccgccc
53501 ccgcctcgaa cgcttcgtca tcgaatacca ctggtacctc caggcccgcg
53551 cctacgcccg ctgggtccgc ggggatttcc aaggagccct ggacctcctt
53601 ctcgcctgcg gtcggtccct ggaggagtcg cgcttcagca acccggcgtt
53651 cgtgccgtgg tgggccgatg gcgcggtgct cctggcgacc ctggaccgcc
53701 acgaccaggc gcgcgaactc gccgcatacg gaagcgagtt ggccgagcgt
53751 tgggggacgg cgcgcggcct cggactggcc ttcatggccc agggcgtcgc
53801 cgcacccggc cgcgccggca tcgatcacct caccgaggcg gtctcgctgc
53851 tcgcggactc cccggcccgg gccatggagg cccgggccga acttcttctc
53901 ggacacgccc acctgaagcg cgacgacctg cgggccgccc gggaacacct
53951 gcgcgccgcc gccgacctcg cccagcgctg cggcgccgtg aagctcggcg
54001 tcgacgccag aaaactgctg gtcaccgcgg gtggtcgggt acgcaggatg
54051 accgcctccc cactcgacat gctgaccggg atggaacgca cggtggcgga
54101 cctcgcggtg accggcgcga gcaaccgggc catcgcggaa gccctcttcg
54151 tgactgtaag gaccatcgaa acccatctca cgagcgtcta ccggaagctc
54201 ggggtcggcg ggcgtgcgga gctgtccgcc gtcctggaga ccaggaccgc
54251 cacctccggt cggcagccgc cggcctgggt ctcccaggca cgcggacgcg
54301 cttgagacga acaggacgag aggtagcggt gccccggagc aaggcgcgga
54351 accaaccgac cacctgcacg ccgcagtgcg cacccgacgc ccacggcgac
54401 cccaccatgc tcctcgaatg cggccgggaa cagcggctca tcggcgacct
54451 cctgcaccgc ctcggccagg ggcggccatc ggtgctcagc ctgaccggcc
54501 ggcccgggca cgcccagaac gccctggtcc gctggggcgc gtgccgggcc
54551 aggcacgacg ggctgcgcgt cctgcgagcc caggcgacgc ccgcggaacg
54601 ggaactccgc tacggcgccg ttctccaact gctggccgtc ctcgacggcc
54651 cgcacggcag caccctggac gccgcgatcc gccacgacgg tcccccgcca
54701 ctgcccgtgc ccggcatcga ggaggtgctg cggcgcaccg gcacggcacc
54751 caccctggtc gtggtcgaag acgtccagtg gttggacccg gcctcgctga
54801 cgtggttgca gatcctgctg cgccacctcg ggccggacac cccgctcgcc
54851 gtcctggcca gcagctgcgg tgacaccacg gccttcgaca ccgacccgaa
54901 ggcccccgcc gtcccggggc cgccggacac cgtgcccgtc gcgcgcttcg
54951 tggtgcccgc gctcaccgac cgcggggtcg ccgccaccgt ccgcgccgtc
55001 tgcggcaccc ccggcgacga ggagttcatc gccgcgctca cctccgccac
55051 cgccggcaac cccgccatcc tgcgggacgc cctgcgcgcc ttcgtcgacc
55101 acggcctccc cgccgacgcc gaccacctcc cggagctgca cgccctcacc
55151 gctggcgtcg tcggcgacca caccgtgcgc gccctggacg gcctgcccgc
55201 cgaagtcaac gccgtcctgc gggccctggc cgtctgcggc gacctgctcg
55251 acttccaccg agtccgggcc ctcgccggcg cgcactcgct gtccgaggac
55301 cggatccgca ccctgctggc gagcgtcggc ctgaccgtgt ccgtcggcga
55351 caaggtgcac atccgcttcc ccgcctccaa ggcacgcgtc atcgaggaca
55401 tgcccgccgc ggagcgcgcc gatctgtacg tccgcgcggc cgaactcacc
55451 cacagttgcg gcgtcaacga cgaggacgtc gcccatctgc tgctgcgctc
55501 gtcgccgctc ggcgcaccct gggtcgtgcc cctgctccgc cgcggattcg
55551 ccgccgcgct gcgccgggag gaccaccacc gggcctgtgc ctgcctctcc
55601 cgcgccctgc aggaacccct cgacccccgg gaacgcagcc tgctgacgtt
55651 ggaactggcc gcggccgaag ccgtcgcccg gccggaggcg ggggatcgac
55701 gcctggggga actcgtccgc agcaccgtcg cggacaccga ccccacgtcg
55751 tccggtgagg gggtgggggt ccgcgccatc gacctggggt tcgcccgggg
55801 caacagcgaa tgggtccgcc gcaccgcggg cgaggccctg ccgtacgccg
55851 ggccggccga ccgggaggaa ctggtcgcgc tgttctggtt ggccgccgtg
55901 cgggacgacg acgcgccgat gatccccgtg gtgccccggt tgcccgaccg
55951 gccggtgccg ccggcccagg ccggcgcccg tgcctggcag ctggccacgg
56001 cgggggagga cgcggacaag gccaggaagc tcgcccggat cgccctcacc
56051 ggcggggtga acgagagcct gatgatgccg aaactggcgg cctgcgccgc
56101 gctgttcgcc accgacgaca acgacgaggc ggtgcacggc ctggacacca
56151 tgctcaccgc cgcccgcagt gcccacctgc gcagcatggc cgcccgcatt
56201 ttcaacctac gggcccggat acacctgtgc gcggcccggc tggaggccgc
56251 cgaacgcgat ctggacagcg ccgagcgcgc cctgccgccg acgagttggc
56301 acccccgtgc gctgcccaac ctgatcgcca cccgcatcct cgtcagcatg
56351 gagacgggcc gcccggaccg ggcccgccga ctcgccgagg ccccggtccc
56401 cgccggcggc gaggagggtg tgtggtggcc cgccctgctg ctcgcccgcg
56451 cccgtgtggc cgccgacgac ggtgactggg aggaggccct gcggctgtcg
56501 cgggagtgcg ggcgctggct ttttcgccgg cactgggcca acccggccat
56551 gctcagttgg cgcccgctgg ccgccgaggc gtgtctgaag ctcggcgacg
56601 tgacggaggc gcgccggctg cgggacgagg agctgttctt cgccgaccgc
56651 tggggcaccg cgagcgcccg cgggatcgcc cgcctgacga cgcggcgact
56701 cttcgacgac gacggcgacc gggccgtccg gcggatccgc gaggccgccg
56751 ccctgctccg cgactcgccc gcccgcctgg cctacctgtg gagccggctg
56801 agccaggccg gtgccgagac ggcccacggc gacaccgccg cggccgcacg
56851 ctcctggcag gcggtcgccc ggatgaccgc cgcccacccc gccagccgcc
56901 tcgccaccgc cgcccgcacc ctgaccgtcc cgtccgttcc ggtcgccacc
56951 gcgccgccca ccgccgtcgt cccacccgga tggcgcgacc tgtccgaggc
57001 ggagaaggac accgtgctgc tcgccgcccg cggccacggc aaccgccaga
57051 tcgccgaaca actcgccgtc agcaggcgca ccgtggagct ccggctgagc
57101 aacgcctacc gcaagctgag gatcggcgga cgcaaggagc tgtacctgct
57151 cctggaggcg ctggaaggac cggtcgcgga tgcttcttga gcgggagaac
57201 gaactggccc ggatccgggc cgccctggac gccgcggaag cgggcgactc
57251 ctcgctcctc ctgatcaacg gtcccctcgg cagcgggcgt tcggcgctgc
57301 tgcgccggat accggagctg gccggcgacg gcacccgcgt cctgcgggcc
57351 agcgccgcct ggcgggaacg cgacttcccc ttcgggatcg cccgccaact
57401 cttcgaccac ctgctgtccg gggcgggcgg cgcagggccg gccgaacgca
57451 ccgccggggc agagcacttc agccgactga tggacaccgg cgaccgccct
57501 accgggaccg gccccgccct ggaggtctcc caggcagtgc tccagggcgc
57551 ccaggcgctg ctcgccgacg cgtccgcgga gcggcgcctg ctgatcctgg
57601 tcgacgacct ccagtgggcc gacggcccgt ccctgcgctg gctggcccac
57651 ctcacccggc ggctgcacgg cctgcgggcg ctgctggtgt gcacgctggc
57701 cgacggcgac caccggggca ggtaccccct ggtccgggag gtcgccggcg
57751 ccgcgcacac cgtcctgcgc ctggcgccgc tgtcccggga cgccacccgc
57801 gtcctgctcg ccgggcccca gggccggccg ccgcaggacg cactggtgcg
57851 cgccgtgtac gaggcgtcca ggggcaaccc gctgttcctg accgccttcc
57901 ggagcgctct gcgcgccacc ggaaggccgc ccggcggcga ccacttcggc
57951 gccgtccggg agctgagccc gacggtgctg cgcgatcggc tcgcgggcca
58001 tctgcggatc cagccgcagc cggtgcgcga ggtcgcggtg gcggtggccg
58051 cgctgggcga ccacagcgat ccggtgctgc tcgcccagct cgccggggtc
58101 gatgaaatcg gtttcgccgg tgcccgccgc gcgctggtgg acgccggcct
58151 gttggcccgg ggacgggacg tccgcttcgt ccacggcgtc gtccgcgatg
58201 cggtggactc cctgctcacc ctcgacgagc gggaacgctc gcacgacgac
58251 gccgccgatc tgctgtaccg ctgcgggcgg ccggccgagc aggtcgccgg
58301 ccatctgctg gccgtggtcc acccgggccg gccctggtcg gaggcggtcc
58351 tgcgctccgc ggcccacaac gcgctgcgcg ccggccggcc cgccgacgcg
58401 gcccggtacc tgcgccgcgc cctgctgcac caccgcaccc aggacggctg
58451 ccgcgcccgc atcctggtcg atctggccac cgccgagcgc gccctcgacc
58501 ccgatgcctg tgtacgccac gtcagccagg cggtcgcgct gctggacacc
58551 tcgcgggatc gggccgccgc cgtgttgcgc atcccgccgt ccctgctcgc
58601 cgcccccagc ccgtccgccg tcgagttggt gcggcaggcc gccgccgggc
58651 tcgacgaacc ggggcagcgg gacgaggagg gagccgacga actcgccctg
58701 cgcctggagg cgtggctgcg gcactccggc cacgagaacc ccgtcgagct
58751 ggcgtcctcg gtggcgcggc tgcggcgcat gggggcacgg ccgccggtgg
58801 acagcgtcgc cgaacgcgag ttggtcgccg tgctgttgag tgcgggcgcg
58851 ctcagcggcc ggctcagcgc cgcggagatc gccgacaccg gcaaccgcat
58901 cctcgaacgt gagccggcca ccgccgccca tgcccacacc ccgctgccgc
58951 tggtgatgct ctcgctgttc gtcgccgagt ccgtgcaggg cgtggcctcc
59001 tggctggcca gcgaacagca cacccggcgc cggtacgcga ccggcgcgga
59051 cgatgtgctg ctgaccgccg agcgggcctt cgtcctggtg acgcagggcc
59101 gcccggccgc cgcgcgggag cacgtcgaac gcgccctggt catggacgcc
59151 ggcgactggt cggaacccgc cgtcatgatg ttcgccgcgg tcgccttcga
59201 actgcgcgac ccggccttga gcgaacgcat cctggaacgg atccgcgacc
59251 gccggccggc cggactggcg ctcaccgcca ccggtcagat gctccaggcc
59301 gccgtcgacg tgcacttcgg ccgggggcgg gacgccctgg acacgctgct
59351 ggcctgtggc cgacgcctgg agaccgtggg atggcgcaat tccgcactgc
59401 tgccctggcg tccgtatgca atcgggctgc accagcggct cggcgagacc
59451 gatgccgcgc tgcaactcgc cgaggacgag ctgaggtggg cccgggagtg
59501 gggtgcgacg acgaacctgg gccgggccct gcgcctgaag ggctggctgt
59551 tgcaggacga gggactggat ctgctgcgtg agagcgtcga gatcctccgc
59601 gcttcgtcct acgcgacgga actggcgcgc accctcgtcg tcctcgggcg
59651 tcggctgccg ggtgggccgg aggccgaggc ggtgttgcgg gaggccgcgg
59701 ggatcgccgc cgcctgtggt gttccctggc tcgccgaacg ggccgaactc
59751 ggtctgggca gtgccatcgt gccgccggtc gccaccctga cccccagcga
59801 gcggcgagtg gcgtcgctgg tgagccgggg cctgaccaac caggccatcg
59851 cgaccgaact cggtgtgagc tcccgggcgg tggagaagca cctcaccagc
59901 gcctatcgca agctcggcgt ctccggccgc cgcgagttgg tgaatgcgct
59951 cccgggtcgt tgacgcccgc ggccgtcccg ttccggacga tcacaatcct
60001 ctgtgcctgt cagccctgtg ggtgcggcgt cgttgtcgta cgcccgtccg
60051 gtcatgacag caatcctcaa caaaagctca aacggaagcg ggggcaggtg
60101 catccacccc atcaccaatg tctcgtatgt tacttttggc ccaccgtagt
60151 tgggactctg cggaccgtac ctcgtcaagt acggaccacg gatttagggg
60201 tgcgtagatg tgtcgcgttg acgtgccgag ttgtggctaa ctacgctccg
60251 tttcacggaa atcgaaactc ttcggcgtga atcgcggcga acttcgatcg
60301 aattctggat tccgtttctc agaaggctaa aagacgacgg gggttatccg
60351 aaaggatggc ggtctcgtga ctatcactca cctcactgac tcgaatcacc
60401 gtacgaccgg cggggtgata tccgctcaga ccgcaccggc cggggagtcc
60451 gtcggccccg ggctgatggc gtccctcgac cgtgacctca ccatcaagca
60501 cgccaaccag gagttccgcc gccgcttcga cgattccgcc ggtgacgtct
60551 gcggtcgcag cttccgggac ctgatgcacc cgagcgtgca gcagccgctg
60601 atgcgccagt tctcccggct gatcgagggc aagcggcacc gcttcgcctc
60651 gcacgtggtc gccgtgggcg cccaggacgc ggccttcgcg ggcaccctca
60701 ccgcctccgc ggtcaccggc aagaccccgg acatcgccgg gatcctggtg
60751 ctcatggact cgtccggcgc cgccgacgcc gccgatgccg gcgtcgtgac
60801 cagccagaag aagttcctga ccgagatcga tgcgcgcatc ctcgaaggca
60851 tcgcggccgg gctgtcgacc attccgctcg cctcgcggct ctacctcagc
60901 cgccagggcg tggagtacca cgtcaccggg ctgctgcgga agttgcgggt
60951 gcccaaccgc gccgcgctcg tctcgcgcgc ctactccatg ggcatcctga
61001 acgtcggcac ctggcccccg aaggtcgtgg acgacttcat caagtgacgc
61051 ccggccggcc tgtcccgcac cggccggcca ccgtcggacc gctgtcccgc
61101 acgcccgtgt cgccgacgcc cgcctctcac ggggcgatgc ccggcggcac
61151 gggcgtttcg tgccctggtc gatgtttatc cattctcacg ttacgaaccc
61201 gcaatgctca tgcgcgcgat tccggccact tttcgttccg tcgtaaccaa
61251 actgcccgtc atatcgtttc cactgaccgt ttctgtggtc cgcggtgtgc
61301 cacctgggcc ccactccgac gcgcccgcga cgggtcgcga gcggcctccg
61351 aatactcacg gtaacccccg ttcctcctgg tggagttccc cgcgcgtcac
61401 attcctccgt gcctcgtgcg gcgtggctcg cccgctttcc ccgccgccgc
61451 gacagccatt gttacaacgg cacgacagag cgggggtggt cgacccgtcc
61501 gcggtgcggc gcagtcaagc ggaacctctg ctcgatgttc gtgccacccc
61551 gaactctgtg ttcttggtga ccgttttcac caatgccccg cccgttcggt
61601 ggcgaatccg ccgcccgttc ctggccacaa cggatcgctt cggccgccca
61651 ccacggccgg cgcagccgag gcttgttgga atcaccccca cccagtcggg
61701 gcgccgttgt ggcccggaca cgcaggagga acaccgtgga tgctgagggc
61751 cgccgcaggg acatgctgga gctgatccgg cgcagcggat ccgccgatgt
61801 cgtgcgtctc gccgaggagt tcgccgtcag caaggaaacg gtccgccggg
61851 acctgaacgt cctggagggc catgggctga tacggcggcg gcacggtggc
61901 gcctatccca tggtgcggcc gggctccgag gccgtcttcg tctcccggac
61951 cgcacagccg atcccggagg agtcccggat cgccaccgcg gccgccgaac
62001 tcctcagcga ggccgagacg gtcttcatcg acgagggctt caccccgcaa
62051 ctcatcgccg acgccctgcc gcgcgaccgg ccgctgacca tcgtcaccgc
62101 ctcgctcccg gtggtcagcg ccttcgcgac gagcccacag gccaacgtgc
62151 tgctcctggg cggccgggtc cgccggggca cgacggccac cgtcgaccac
62201 tgggccgtcc acatgctgtc cggcttcgtc atcgacctgg ccttcctcgg
62251 cgcggagggg atctcgcgca ggtacggcct gaccaccccc gacccggcgg
62301 tcgccgaggt caaggcccag gccatccgcg tcgcgcgccg cccggtcctc
62351 gccggggtgc acaccaagtt cggcacggcg agcttctgcc ggttcggaga
62401 ggtgggcgac ctggagacga tcgtcaccgg cgccggcctg cccgtcgccg
62451 aggcccaccg ctaccacctc atgggcccca aggttttacg ggtgtgacgc
62501 cgccggggcg tcccgccgcc cgcccacggc ctggcgcgcg ccggtgcggg
62551 tcagccgcgt cgcgcgtccg tcggttcccg ccagagtgcg agcggtgtca
62601 cccccacccg gtagccctcc tcgtcgagga gatggccgcc gtgcttcatg
62651 tgccagaggg cgaccgcgcg ttcggtgacg gtcaggtccg cggcccgttc
62701 ggcgagtcgg acctcgaagc gctgggcgtc gtcgagggag cgcacccagg
62751 ccaccaggtg caggttgtgc cggctgatga cgctcgcgca caggcggacc
62801 tcgcgcatcc cggtgacccg gcgggtcacc tcgcgcagcc tcgcggccgg
62851 gacctgcccc cagaagctca ccgtgaccgg ccactcggac agcgggcggg
62901 cgacctcgca ccgggcgtgc aacatgtccg cggcgaagag ccgttggacg
62951 cgacggcgca cggtgtcggg gccggcgccg cactgctcgg ccagcgcgcg
63001 gtaggtggcc cggccgtcca cggacagcgc ggtgaccagc tgttggtcga
63051 gctcgtcgag gacgaaggcg ggggtgtccg tgcggtgtct ggacgcgtcg
63101 gcggcgagcc gggcgacctg gtgccggccg agggcccgca ggcgccaccg
63151 gctgccctcg gtgtgcaccg ggccggccag gtgcgtccgg gcggcccgga
63201 cgccgtccaa cgcggcgagg tcgtgggtca cccaacggga gagcatggcc
63251 ggatcgcggg ccatgacgtt gagttggagg tcacggtccc cggtgacgtg
63301 ggagagcgcc accacgtgcg gcgcggcggc caacgcccgc gcgacgtcga
63351 gcagtcggcc gggagcgcag tcgatctcga cgaacgccag gcacccctgg
63401 cccgacgcgg ccagcaccgg ggccggatgg cagctgatcc aggcggcgcc
63451 ggtctcgacc agccggttcc agcgcctggc caccgtcacg gcgtccagtc
63501 cgaggacgga gccgatccgg gtccagctgg cccgcggcgt gatctgcagc
63551 gcgtgcacca gggcctgatc cacatggtcc agggaccggg gggtctgccc
63601 ggaatcctgc gccacgggcc acctccttgc gtgtttccgg cggattcggg
63651 ccgccggtcg gctcaacctt cagcctggac tcgggtacgg ccggaccgta
63701 ccaggcaacc cccggagcaa caggagtggg attcggcatg tcggcagtgg
63751 aacgcgccaa ggaaatgcaa cccgagttgg agcggctgcg ccggtcgctg
63801 caccgggagc cggagttggg ccttgcgctg ccgcgcaccc aggagaaggt
63851 gctcgcggcg ctcgacgggc tgccgttgga gatcaccccg ggcaccggcc
63901 tcacctcggt gaccgccgtc ctgcgcggcg gccgccccgg cggcgcagtg
63951 ctgctgcgcg gcgacatgga cgcgctgccg gtcaccgagg agagcggtgt
64001 gccgtacgcc tcggagatcc ccggccggat gcacgcctgc ggacacgacc
64051 tgcacaccgc cggactcgtc ggcgcggcca ggctgttggc cgaacgccgg
64101 gaacgcctgc acggcgacgt ggtgttcatg ttccagcccg gcgaggaggg
64151 ccacaacggg gccggcgcga tgatcgagga gggcgtgctg gacgcggcgg
64201 gcaagccgct cgacgcggcg tacgccctgc acgtggcgtc caaccaggtg
64251 cccggcggcg tggtcatcac ccgctccggc accatcacct cggcctccga
64301 cctgctgacg gtgacggtgc ggggcgaggg cggccacggc tcgacgcccg
64351 ccacggccaa ggacccggtc ccggcggcct gcgagatggt caccgcggtg
64401 cagaactggg tgacccgcgc cttcgacatc ttcgacccgg tcgtggtcac
64451 cgtcggcacc ttccacgccg ggacgaaggc gagtgtcatc cccgacaacc
64501 gccgagttcc aggccacggt gcgcagttat tccgaatccg cccgggaccg
64551 gctcgccgaa gggaactgcc ccgtctggtg cgggccatcg cggccgccca
64601 cggcctggaa gcggaggtcg actacctccg gcactatccg gtcaccgtca
64651 acgacacgga cgagaccgcc ttcgcggtgg ccaccgcccg cgacctgttc
64701 gcgcccggcg aggtgtggga gtcgccgata cccaccaacg gctccgagga
64751 cttcgcgtac gtcctccggc gcgtccccgg cgcgatgctg ctggtgggcg
64801 ccgcccccga ggggagcgac tggcagcacg cgccgatgaa ccactcgccg
64851 cgcgccgtct tcgacgaccg ggtgctgtac cggcaggcgg cgctgctgac
64901 cgaactcgcc gcgcgccggc tcgcggccac cgcgccggcc gagcccgccg
64951 tggcgggatg acaccgccgc cgcacgcgtg gggcccgcac gcgtgcggcg
65001 tacgggccct ggccggtctc cttcccggct acttgcgcgc ggggcactcc
65051 ttccacgcga agtggtagag ggtgttgatg ctgccgtcgg tggagtcatg
65101 gccatgtagc tggtggcctt ggggtcagcc ccgctgccgt






SEQ ID NO. 2



DNA sequence: Nys2



1 gaattcggat cgatgctgct cgtggtgtgt gccgcgggag gtgccgacgg
51 tgccgtgcgt gaatcggggc gtggtcgccg tcgcgatcgc ggtggtgtcg
101 gtgctgctcg tgctgaccgt ggtggtggcg acggcctact tgccgtgagc
151 ccggtccttg cgccgcgacc tgggcagcgg atgcacctct tcggtcgacg
201 ccaaggggcg gtggcgcagg tcgagatagg cgcggacgga gacgccgggc
251 tgccccggcg tctgccaggg cgcggagggg ccgctggaac gccacagacg
301 gcccagctcg gccgcggcaa gggtggcggt ggcttcgtcg gcggcggtga
351 cgtcgatgac gaccatgccg ggttcgctga catgttcctc gtggatatgc
401 acagttctgt atgacctatt tcgccccgtg gcgtaaggag tagccggcag
451 gtttcatccg aaggtggggc gcgggagcgg cgagttgacc gtgaggtgcg
501 tgcggtgttc ccggtcgcgc cggggggtgg gggccggcag gtggaccgcc
551 gtggtcagca gggcgggggt gctgtgccag cggccgggga agaccgtgac
601 ccggcgaccg cccagcaggg ggcccgggac caggagtcgg gcggtgaagg
651 cgccctgtgt cgggtcgatg acgatctcgg cctcggagaa gtccagttcg
701 cgctgggtca ggggatacca cgccttgaag acgctctcct tggcgctgaa
751 gagcagccgg tcccagtgga cgtccggacg atgggccgcc agggccacga
801 ggtgcggccg ttccgagggc agggcgatgg cgttcaggac gccggccggc
851 agcggaccgt tcggttcggc gtcgatgctc accgcggccg acagctccgc
901 gggggagacg gcggcggcgc ggtagccggc gcagtgcgtc atgctgccga
951 cgatgcccgg cggccactgc ggggcgccgc gccgattggg cagtatggcc
1001 cggtccgggt ggccgagccg gcgcagggcc cggcgggcga gatggcggac
1051 ggtggtgaac tcgcgctgcc gggactccac ggcccgggcg atgacctcgc
1101 gttccgagga gagcagccgg tcgccggggc gcggccggtc gccgtacgcc
1151 gcctcggtgg cgaccgtggc gggcaggatc agttcgatca ccgcatacct
1201 ccggcgaacg gtaagaacag ggggttgctt cggggcacgg ttccgtcctt
1251 gacggggcgc cgtgggcggc cgggtcagcc ggccgcggcg cccgcggtgg
1301 gagtgtgggt gcgggctgcg tgcagccggg cgtacaggcc ctgcgcgcac
1351 aggagttgat cgtgggtgcc ctgctcgacg atgcggccgg cgtccatcac
1401 gacgatcagg tccgcgtcgc ggatcgtgga caggcggtgc gcgatcacga
1451 agctggtccg gcccgcccgc agggagttca tggcgcgttg gatcaggacc
1501 tcggtccggg tgtccacgga gctggtggcc tcgtccagca cgaggacggc
1551 cggtctggcg aggaaggccc gggccacggt cagcagctgc ttctcgccgg
1601 cgctgacggt gcccgactcg tcgtccagca cggtgtcgta gccctgcggc
1651 agggtgcgga tgaagcggtc ggcacaggtc gcgcgggccg cctcctcgat
1701 gtccgcacgg caggcaccgg gtgcgccgta cgcgatgttc tccgcgatgg
1751 tgccgccgaa cagccaggtg tcctggagca ccagcccgaa gcgggaccgc
1801 aggtcgtcgc gggtcatcgt cgcggtgtcg gtgccgtcca ggaggatgcg
1851 gccggagtcc ggttcgtaga agcgcatcag gaggttgccg agggtggttt
1901 tgccggcgcc ggtggggccg acgatcgcca cggtgctgcc cggttccacg
1951 gtcagcgaga ggttctcgat gaggggcgtg tcgggggagt agcggaagga
2001 cacgtcggtg aactcgacgc ggccctcggc gcgggcgggc gtgccgggcc
2051 ggagcgggtc cggggcctgc tcgggggcgt cgagcagggt gaagacgcgc
2101 tgggcggagg cgatgccgga ctggagccgg cccgccaccg aggcgatctc
2151 cacgatcggc tggctgaact ggcgggcgta gaggatgaac gcctgcacgt
2201 caccgagggt cagggtgccg tttatgacct tccaggcgcc gatgacggcc
2251 accagcacat agccgaggtt ggcgacgaac atcatgaccg gttccatggc
2301 accggaggcg aactgcgcct tggccgcagc ccggtagacc gcgtcgttgc
2351 aggcgtcgaa gcgctcctcg gccgccgcgc gccggtcgaa gcccttgatc
2401 agcgcatgac cggtgcacac ctcctccaca tgggcgttga gggtgccgtt
2451 cgcggaccac tgcgcggcgt agtggggctg cgcgcgcttg ctgatccggg
2501 ccgcgatcaa cgccgagacc ggcacgctga gcagcatcac cacggccagc
2551 gacggcgaga tcaccagcat cagcaccagc atcgtcaaca gcgagaagat
2601 cgaggtgatc agctcggcga gggtctgctg gagggtctgt tggaggttgt
2651 cgatgtcgtt ggtggtgcgg ctgagcagct caccggccgg ctgccggtcg
2701 aagtggcgca gcggcagccg ggtcagcttc tcccgggcgt cgcggcgcag
2751 ttcgtggatg gtgcgccaca ccgcggacgc caccagccgg ccctgcgcca
2801 gcatgaacag cgacgccacg acgtagagcg ccagcaggac cagcagcagc
2851 cggccgatcg cggcgaagtc gatcccgggc gccgggcccg ggacgccgcc
2901 gagcacgccg tcggcgatca gatcggtgac ccggccgagc agcagcgggc
2951 cgaacgcgtt gagcacgatc ccgccgacgc ccatcgacac ggcgagtgcc
3001 acggagcggc ggtgcggacg cagcaggccg acgagccgac gtgccggccg
3051 gggcgcggtc cgctcctcct caaggtcgtc cggcgaggcc atgggcggct
3101 tcctcctcgg tcagctgcga gagcgcgatc tcgcggtagg tcgggctggt
3151 gcgcagcagc acgtcgtggg tgccctgggc gaccacgcgc ccgcggtcca
3201 gcaccacgat ccggtccgcg tcgcggccgg cggagatccg ctgcgccacg
3251 gtgatcaccg tggcgcccgc ggtgtacggc accagcgcgg tccgcagcgc
3301 cgcatcggtg gcctggtcga gcgccgagaa acagtcgtcg aagagataga
3351 tctccggccg gcgcagcaac gcccgggcga gggacaggcg ttggcgctgg
3401 ccgccggaga cattgccgcc gccctgggtg atctccgcgt cgaggccgtc
3451 cggcatccgc gccacgaagt cggccgcctg ggcgacccgc agcgcccccc
3501 acagctcctc gtcggtggcg tccggccgcc cgaagcgcag actgctcgcc
3551 acggtgccgg agaacaggta cggccgctgc ggcacgaacc cgacggcggc
3601 ggcgagcgtg gccgcggtca gctcgcggac gtcggtgccg ccgacccgca
3651 ccgcgccctc ggtggcgtcg gccagccgca gcaccaagtt caacagggtc
3701 gtcttgccgc tgccggtgct gccgagcacg gcgatccgct cgcccggctc
3751 gacggtcagg tcgacgtccc gcagcacggg ctcctcggcg cccgggtagc
3801 ggtacccggc cgcgcacagt tcgatccggc cggcgggccc gcgcaccggc
3851 tgcggcgcgg ccggcggcgc cacgctcgac ccggtgtcca ggacctccgc
3901 gatccggccg gcacagaccc gggcccgcgg caccgacagg aacacgaagg
3951 cgagcatcac gacggacatc aggatcagcg agagatagct caggagggcg
4001 ctgagcgagc cgatcggcat ccggcccgcg tcgatccggt gggagccggt
4051 ccacagcagg gctacggtga aaccgttcat cagcagcagc acgaccggca
4101 gcatcgccgc gatcagccga cccacccgcc gcgacaccac gaggaacgcg
4151 tcgttggtct gcgcgaaccg cgcgcgctcg tggtcgtcgc ggacgaagga
4201 ccggaccacc cgcaccccgg tgatcgcctc gcgcagcagc cgccccagcc
4251 ggtccagggt cagctgcatc cgcgcgtaca gggtgcccat ccgggccagc
4301 agcaggccga agcagaccgc caccaccagc accagcgcca ccagcagcag
4351 tgccagcgga acgtcctggc gcagcgccag cagcacgctg cccaggcaca
4401 tcagcggcgc gcagacgacg atgccgaagc cggtctgggc gaggttctgc
4451 acctgctgca cgtcgttcac cgaccgggtc agcagggagg gggtgccgaa
4501 ccggccgatc tcgcgggcgg agaagtccag gatgcggcgg aagagcgcgg
4551 accgcagatc gcggcccatc gccgtcgcgg tccgggcggc cagcgcggcc
4601 gcaccgagcg cggcggcgat ctgcaccagc gccaccacgc ccatcaccac
4651 acccagctcg gtgatgcgcc cgccgtcgcc gcgcaccacc ccctggtcga
4701 taagtgcggc gcccagcgtc ggcagcagca aagtgcccag gatctggacg
4751 agttgaaggg cgacgagagc ggcggtggcc caggcgtagg ggcgcagctg
4801 tgcccgcaga agtctcaaca gcacggagga acaccccggt tgacggcggc
4851 gtgcggccgc gcggtggacg gagtcggccg gcccgccgac ccggtcattg
4901 caccgcagtc cgctccgaaa tttcactagt gttgggggtg gcaacggact
4951 tttgacggcc cggtactcgt gaattcccta gaaagcccgg gatgcgttga
5001 cagcatcttc cgcggcttgc gagcgtgcgg gtgtctatcg tccggactgc
5051 gtattcgacg ccaggtcgtg gccgagctca agcgtttgga cggtctcttt
5101 cgtgttcgag aagggtttgc catgtccaaa cgagcgctga tcaccggaat
5151 caccggccag gacggctcct atctcgcgga gcacctgctg tcccagggct
5201 accaggtgtg gggtctgatc cgcggccagg ccaatccccg caagtcccgg
5251 gtcagccgcc tcgcctccga actcgacttc atcgacgggg acctgatgga
5301 ccagggcagc ctggtctccg ccgtcgacac cgtgcagccc gacgaggtct
5351 acaacctcgg cgccatctcg ttcgtgccga tgtcctggca gcaggccgag
5401 ctggtcaccg aggtcaacgg catgggcgtg ctgcgcatgc tggaagccat
5451 ccgcatggtc agcggactgt ccacctcccg cacggtcagc ccgcgcggcc
5501 agatccgctt ctaccaggcg tccagctcgg agatgttcgg caaggccgcc
5551 gagacgccgc agcgcgagac caccctcttc cacccgcgca gcccctacgg
5601 cgcggcaaag gcgtacgggc actacatcac ccgcaactac cgcgagtcct
5651 tcggcatgta cgcggtctcc ggcatgctct tcaaccacga atccccgcgc
5701 cgcggccagg aattcgtcac ccgcaagatc agcctggcgg tcgcccgcat
5751 caagcaaggc ctccaggaca agctggcact cggcaacctc gacgcggtgc
5801 gcgactgggg ctatgccggc gactacgtcc gcgccatgca cctgatgctc
5851 cagcaggacg ccggcgacga ctacgtcatc ggcaccgggc agatgcactc
5901 ggtgcgcgac gcggttcgga tcgcgttcga acacgtcggc ctgaactggg
5951 aggactacgt cgtcatcgac cccgacctgg tgcggcccgc cgaggtcgag
6001 gtgctgtgcg ccgacagcgc caaggcccag gaccgcctcg gctggaagcc
6051 ggacgtcgac ttccccaccc tcatgcgcat gatggtcgat tccgacctgg
6101 cgcaggtttc ccgcgaaaac caatacggcg acgtgctgct cgccgccaac
6151 tggtagcagt tctcaagctt tcgaaaacta gtgaattcct gccggaattc
6201 cgacgacact cgccacatgg attcccccgg tgagtggcga atccaggtgg
6251 cgaatccgaa cgtaccgccg aaacggcgtg gagaagtcgg acgccattca
6301 cgtgcggccg tcggctgtcg agatgggttg agttgagatg gacaacgaac
6351 aaaaactccg ggattacctc aagcttgcga cggccgacct tcgacgcacc
6401 cggcggcgcg tccacaagct ggagtcggcg gcccaggaac cggtggccat
6451 catcggcatg acctgtcgct accccggcgg cgtccgcagc cccgaagacc
6501 tctggcgcat ggtcgaggcc ggcgagcacg gcgtcacccc gttccccacc
6551 gaccgcggtt gggacctgga ggcgctggcc gccgcgccga ccgcctccgg
6601 cggattcctg cacgacgcac ccgacttcga cgcggacttc ttcggcatct
6651 cgccgcgcga ggcggtcgcc atggacccgc aacagcgcgt cgtcctggaa
6701 tccgcctggg aggcgttcga acgcgccggc atcgacccga cgtccgtgaa
6751 gggcagccgc accggagtct tcatcggcgc gatggcccag gactaccggg
6801 tcggccccgc cgacggcgcc gagggcttcc aactcaccgg caacaccggc
6851 agcgtgctgt ccggccgcat ctcctacacc ttcggcacgg tcggccccgc
6901 cgtcaccgtc gacaccgcct gctcctcctc cctcgtcgcc gtccacctcg
6951 ccacccaggc gctgcgggcc ggcgagtgca ccctcgccct cgccggcggc
7001 gtcaccatca tgtccggccc cggcaccttc atcgaaatgg gccgccaggg
7051 cgggctctcc gccgacggcc gctgccgctc cttcggcgac accgccgacg
7101 gcaccggctg ggccgaaggc gtcggcatcc tcgtcctgga acggctgtcc
7151 gacgccgtcc gcaacggcca cgagatcctc gccgtcgtcc gcggcaccgc
7201 cgtcaaccag gacggcgcct ccaacggcct gaccgccccc aacggcccct
7251 cccagcagca ggtcatccag caggccctgg tcaacgcccg actcgccgcc
7301 ggggacatcg acgtcgtcga ggcgcacggc accggcacca ccctcggcga
7351 ccccgtcgag gcccaggccc tgctcgccac ctacgggcag aaccgcccgg
7401 cggaccggcc gctgctgctg ggctcggtca agtccaacct cagccacacc
7451 caggccgccg ccggcgtcgc cggcgtgatc aagatggtca tggcgatgcg
7501 gcacggcacc ctgccgcgca ccctgcacgc cgaggagccc acccaccacg
7551 tcgactggtc gcagggcgcc gtgcggctgc tgaccgacac caccgactgg
7601 cccgccaccg gggcgccgcg ccgcgccgcc gtctcctcct tcggcatcag
7651 cggcaccaac gcccacacca tcatcgagca ggcccccgaa ccgcagcccg
7701 aggacgccgc gaccgcgcag gacgacgccg ccggcagcac gccggccacc
7751 gcccccgtag tgcccggcgt cgtaccggtc ctgctctccg gccgcacccc
7801 ggacgccctg cgcggccagg ccgcggccct gcgcgccgcc ctcgacaccg
7851 gccggcggcc cgacctgctc gacctcgcac actccctcgc caccacccgc
7901 gccgggttcg agcaccgcgc cgtcctcctc gccaccgacc accccgccct
7951 gaccgacggc ctcaccgccc tcgccgacgc cgacgacccg gccgccgccc
8001 ccgcctggat caccggcacc acccgggccg agacccggct cgccgtcctg
8051 ttcaccggcc agggcgccca acgcctcggc gcgggacggg aactcgccgc
8101 ccgtttcccg gcgttcgcca ccgccctcga cgcggcgctc gacgccttca
8151 ccccgcacct cgaccgcccc ctgcgcgagg tcctgtgggg caccgacgcc
8201 gccctgctcg accgcaccgc atacgcccag ccggccctct tcgccgtcga
8251 agtggcgctc taccggctga tcgaatcgtt cggcgtccgc cccgaccacc
8301 tcgccggcca ctccgtcggc gagatcgtcg ccgcgcacct cgccggggtc
8351 ctctccctgg ccgacgccgc caccctcgtc gccgcccgcg gtcgcctgat
8401 gcaggcgctg cccgacggcg gggcgatgat cgccgtccag gcgtcggaag
8451 ccgacgtcgc cccgctgctc gccgggcacg aggaccaggt cgcgatcgcc
8501 gccgtcaacg gcccctccgc cgtcgtcctg tccggcgccg aagccaccgt
8551 caccgcgctc gccgaacagc tcgccgccga cggccgcaag acccgccggc
8601 tgcgcgtctc gcacgccttc cactcgccgc tcatggagcc gatgctcgac
8651 gccttccgcg ccgtcgtcga agacctcacg ctccagccgc cgctcctgcc
8701 ggtcgtctcc aacctgaccg gcaagcccgc caccgtcgcc caactcacct
8751 ccgccgacta ctgggtcgac cacgtccggc acgccgtccg cttcgccgac
8801 ggcatcgact ggctcgcccg gcacgacacc accgccttcc tcgaactcgg
8851 ccccgacggc gtgctgtccg ccatggccca ggactgcctg gacgccgccg
8901 acgcagacgc cgtcaccctc cccgccctgc gcgccgggcg ccccgaggag
8951 cacaccctca ccaccgccct cgccggtctg cacgtccacg gcgccaccct
9001 ggactggacc ggctgcttcg ccggcaccgg cgcccgccgc accgacctgc
9051 cgacctacgc cttccagcgc cgccgctact ggcccaaggc cctccagagc
9101 ggcaccgccg acctgcgctc ggtcggcctc ggtgccgccc accacccgct
9151 gctctccgcc gccgtctccc tcgccgacgc aggcggcacc ctgctcaccg
9201 gccgcctctc ccggcagacc cacccctggc tcgccgacca caccgtccgc
9251 ggcaccaccc tgctgcccgg taccgccttc ctcgaactcg ccgtccgcgc
9301 cggcgacgag gtcggctgcg accgcgtcga ggaactcacc ctcgccgcac
9351 cgctcctgct gcccgaacag ggcggcgtcc aggtccagtt gtggatcggc
9401 aaccccgacg tgtccggtcg ccgcaccgtc aacgtccacg cccgccccga
9451 caccggcgac gacaccccct ggaccgccca cgccaccggc gtcctcacca
9501 ccgccgacgc ctcccgccag ctcccggctt cgtccgagca gggcggcacc
9551 cccctcgccg gcgaccccca ccccgccctc gacgcggccc agtggccccc
9601 ggccggcgcc gaaccgctgc cgctggacgg ccactacgac cgcctcgccg
9651 acggcggctt cggctacggc ccggtcttcc agggcctgcg cgccgcctgg
9701 cgcggcggcg acgtcgtcta cgccgaggtc gagctgcccg aggccggccg
9751 gtccgacgcc gaggcgttcg gcctccaccc cgccctgctc gacgccgccc
9801 tgcacgccgc gcccttcacc ggcctcggcg aacgcggccg gggcggcctg
9851 ccgttctcct gggagggcgt ctccctccac gccggcggcg ccaccaccct
9901 ccgcgtccgc ctgaccccgg tcgccgacga cgcgctcgcc ctgaccgtcg
9951 ccgacggcac cggcgcgccc gtgctgtccg tcgactcgct cgtcctgcgc
10001 agcgtggcga cccaacagct cgacacggcc gccgccgtcg cccgtgacgc
10051 cctcttccgc ctcgactgga cccccgtcca gccgaccgcc accgaccccg
10101 ggcccgtcgc cctcctcggc gccgacccct tcggcctgct cacccacgcc
10151 ggattcgccg acgccccggc atacccggac ctcgccgccc tcgccgcggc
10201 ggacggcccg gtcccgacca ccgtcgtgct gtccctcgcc ggcaccgggg
10251 acgacgcggc cgacccggcc cggtccgcac accgctgcgc cgcggaggcc
10301 ctcgccgccg tacagacctg gctcgaccac catgagcgct tcgccgccgc
10351 ccgcctggtc ttcgtgaccc gtggtgcgac ggtcgggcgt gatgttgctg
10401 ctgctgcggt gtggggtctg gtgcgttcgg cgcagtcgga gaatccgggg
10451 tgttttgctc tggtcgatct ggatccggat ggtgcggtgg gtgcggctgc
10501 gctcgtcgct gcgttggtca gtggtgagcc gcagcttgcg gtgcgcggtg
10551 atgtgttgcg ggtcgcgcgt ctggtgcggc ggccgctcac cgaggtcggt
10601 gcgggtgctg atggcaccgg ggatggcgtc ggggatggct ctggtgtgtc
10651 gttctcgggt gagggtgcgg tcctggtcac tggtggtacg ggtggtctgg
10701 gtgcggtgtt ggcgcgtcat ctggtggccg agtatggggt gcgggatctg
10751 ctgttggtca gtcgcagtgg tgaacgtgcc gtgggtgctg gggagttggt
10801 ggcggagctt gcgggtgtgg gtgcgcgggt gcgggtggtt gcgtgtgatg
10851 tgaccgatcg tgccgcggtg gtggagttgg ttggcgggca tgcggtgtcc
10901 gcggtggttc atgcggctgg tgtgctggat gacggcatgg tgggtgcgtt
10951 gaccggggag cggttgtccg cggtgctgcg gccgaaggtg gatgctgtct
11001 ggcatctaca tgaggcgacc cgcggcctgg acctggacgc gttcgtcgtc
11051 ttctcctccc tcgccggggt cttcggcagt cccggccagg ccaactacgc
11101 ggccgcgaac gccttcctgg acgcgctgat gacgcggcgc cgggcggagg
11151 gactgcccgg cctgtcactc gcatggggac cgtgggagca gtcgggcgga
11201 atgacgggca ccctgacgga cgtcgacgcc gaacggctgg cccgctccgg
11251 tgtcccgccg ctctccgtgg cgcagggcct ggccctcttc gacgctgccg
11301 tggccgggac cgacgccacc tgcgttccgg tccgcctgga cctccccgtc
11351 ctccgcgcac ggggtgaagt gccgccgctg ctgaggtcgt tgatccgcgt
11401 ccgggcgcgc cgagccgccg tcgccggctc cgccaccgcg ggcaacctcg
11451 cccagcgcct gcgccgcctg gacgaggacg gccgcgacga gatggtcctg
11501 gacctcgtcc gcggtcaggt cgccctcgtc ctcggccacg cgaccggtgg
11551 cgacgtcgac gccggccgtg ccttccgcga cctcggcttc gactcgctga
11601 ccgccgtcga actgcgcaac cgcctcaaca ccgtcaccgg cctgcgcctg
11651 cccgccaccc tggtcttcga ctacccgacc gtccggcacc tcgccacgta
11701 cgtcctggac gagttgttgg gcacggatgc cgaggtggcg accgtgcagc
11751 cggccgcggt tgcggtggcg gacgatccga tcgtcatcgt gggcatggcc
11801 tgccgctacc ccggtggcgt cagctccccc gaggacctgt ggcgcgtgct
11851 caccgaaggc accgacgccg tctcgggctt cccgaccaac cgtggttggg
11901 acgtcgaatc cctctatcac ccggaccctg accacccggg tacctcctac
11951 acgcgctcgg gtgggttcct gcatgaggcg ggggagttcg atccggggtt
12001 cttcgggatg agtccgcggg aggcgttggc gaccgattcc cagcagcggt
12051 tgttgttgga gtcgtcgtgg gaggcgatcg agcgggccgg tattgatccg
12101 gtgagtttgc ggggtagtcg gacgggtgtg ttcgcggggg tgatgtacag
12151 cgattacagc gcgatgttgg cgagtccgga gttcgagggt ttccagggca
12201 gtgggagttc gccgagtttg gcgtcgggtc gggtggccta cacgttgggg
12251 ttggaaggcc cggcggtgac ggtggatacg gcgtgttcgt cgtcgttggt
12301 ggcgatgcac tgggcgatgc aggcgttgcg tagtggtgag tgtgggttgg
12351 cgttggccgg tggtgtgacg gtgatgtcga cgcctgcggt gtttgtggac
12401 tttgctcggc agcggggttt gtcgccggat ggccggtgca aggcgtttgc
12451 ggatgcggcc gatggtgtgg gctggtccga gggcgtcggc gtgttggtcc
12501 tggagcggca gtcggacgcg gtgcgcaatg gtcacgagat tttggctgtg
12551 gtgcggggtt cggcggtcaa ccaggatggt gcgtccaatg gtttgacggc
12601 gcccaatggt ccgtcgcagc agcgggtgat ccggcaggcg ttggccagtg
12651 gtggtctgac ggccggtgac gtggatgtgg tggaggcgca tggtacgggt
12701 acgacgctcg gtgatccgat cgaggcgcag gcgttgttgg cgacgtatgg
12751 gcgggatcgt gagcctgagc ggccgttgtt gttgggttcg gtgaagtcga
12801 atctggggca tacgcaggct gctgcgggtg tggcgggcgt tatcaagatg
12851 gtgttggcga tgcggcatgg tgtggtgccg cggacgttgc atgtggatgc
12901 gccttcttcg catgtggact ggtccgaggg tgcggtggag ctgctcagtg
12951 agcaggcggc ctggccggag acgggtcggg tgcggcgggc gggtgtctcc
13001 tccttcggca tcagcggtac caatgtgcat gtcatcgtcg agcaggcgcc
13051 gggcgccaag gcgatcgccg cggccggtgc ggcgcggcgc acgccgggtg
13101 ccgtgccggt gctgctctcg gggcgtggcc ggagtgccct gcggggccag
13151 gccgcccgcc tgctcggaca cctccaggcc cgacccgacg ccgaactcgt
13201 cgatgtcgca ctgtcgttgg cgaccacccg gtcccgcttc gagcagcggg
13251 ccgccgtcgt ggcgcaggac cgcgaccagc tgatcgcctc gctgggggcg
13301 ctggccgccg accgccccga ccccgccgtc gtcgagggcg aggccgccgg
13351 acgcggccgg accgcggtgc tgttcaccgg acagggcagc cagcgggccg
13401 ccatggggcg tgaactccac gaggtgcagc cggagttcgc cgcggcgttc
13451 gacgcggtgt gtgccgtttt cgacccgctg ttggaccggc cgctgcgcga
13501 ggtggtgttc gccgaggacg gcagcgacga ggccgcactg ctggacgaga
13551 ccggttggac gcagccggct ctgttcgccg tcgaggtggc gctgttccgg
13601 ctggtggaga gttggggtgt ccgtccggac ttcgtggccg gccattccat
13651 cggtgagatc gcggcggcgc acgtcgccgg ggtgctgacg ttggaggacg
13701 cctgccgtct ggtggccgcg cgggcgacgc tgatgcaggc gctgccgacc
13751 ggcggcgcga tgatcgcgat ccaggccacc gaggacgaga tcgcggcgca
13801 cctcgacgac acggtggcga tcgccgccgt caacgggccg cagtccgtgg
13851 tgatctccgg tgacgaggag gccgccgaaa cgatcgccgc cacgttcgcc
13901 gaacgcgggc gcaagaccaa gcggctgcgg gtgagccatg ccttccactc
13951 gccgcggatg gacgggatgc tggacgcctt ccggatcgtc gccgaggggc
14001 tgacctaccg ggcgccgcgc atcccgctcg tctccgacct caccggccgg
14051 cgcgccgacg atgcggaggt gtgcaccgcg gagtactggg tgcggcacgt
14101 ccgagaggcc gtgcggttcg ccgactgcgt gcggacgctg cgcgacgccg
14151 gggccaccac cttcctggaa ctgggctccg acggcctgct gaccgcgatg
14201 gccgaggaca ccctcggtga cgaccacgac gccgaactgg tgccgatgct
14251 gcgcgccggg cgcgccgagg aactggccgc ggccaccgcc ctggcccgcc
14301 tccaggtgcg cggcgtggac gtggactggg cggcgtacct cgccggcacc
14351 ggcgcccgac gcaccgacct gccgacctac gccttccagc acgcgtacta
14401 ctggccgcag ctgccgaccc cggccgccgc cctcgccgcc gccgatcccg
14451 ccgaccagca gctgtgggcc gctgtggagc gcggcgacgc ccgcgaactc
14501 gccgacatcc tcggcctggg cgaacaggac ctcacgccgc tggactccct
14551 gctgcccgcc ctcacctcgt ggcggcgcgg caaccaggag aagcacctcc
14601 tggacaccct gcgctaccgc gtggagtgga cacgactgag caagccgacc
14651 gccccggtcc tcgacggcac ctggctgctg gtcgcctccg acgccaccgc
14701 ggccgaccag ccagccctcc tcgacggcct ggccgacgcc ctcggctcgc
14751 acggcgcgcg ggtgcgtcgc ctgcttctgg acgactcctg cgcggaccgc
14801 gcggtgctcg ccgaacgact ggcgcggacc gccgacgtgg acgccgcgac
14851 ccaggtgctg tccgtgctgc cgctcgacga gcgggacgcc gacgactgcc
14901 cgccgctcac ccgcggactg gcgctgaccg tcgcgctcgt ccaggccctc
14951 gccgacaccg gcgcccaggg ccggctgtgg accgccaccc gcggcgccgt
15001 ctccaccaac cccgccgacc cggtcaccca ccccgtccag gccgctgcct
15051 ggggcctggg ccggggcgtc gccctggagc acccacggct gtggggcggc
15101 ctggtcgacc tgccgcaggt cttcgacgag cgggccggac agcggctcgc
15151 cgggatcctc gccgtcaagg acgcaccgga cggcgaggac caggtggcgc
15201 tgcgggccac cggagtctcc ggccgccggc tcgtccgcca caccgtcgaa
15251 gcgctgccca cggccgcgga gttcaccgcc accggcactg tcctgatcac
15301 tggtggcacc ggtggcctgg gcgccgaggt cgcccggtgg ctggcccgcg
15351 ccggcgccca gcacctcgtc ctgaccagcc gccgcggccc ggacgcgccg
15401 ggcgccgccg aactccgggc cgaactggag ggctacgggc cgtcggtgtc
15451 cgtcgtcgcc tgcgacgtcg ccgaccggga cgcgctcgcc gccgtcctca
15501 ccgcactgcc cgaggaactg ccgctgaccg gtgtcgtgca caccgcaggc
15551 gtcggccact acggcccgct ggacaccctg agcaccgccg agttcgccgg
15601 cctcaccgcc gccaagctcg ccggcgccgc ccacctcgac gccctgctcg
15651 ccgaccgcga actggacttc ttcgtcctct tcggctccat cgccggtgtc
15701 tggggcagtg gcaaccagag cgcctacggc gccgccaacg cctacctcga
15751 cgcgctcgcc ctgcaccgcc gcgcccgcgg cctcgccgcg acctccgtcg
15801 cctggggccc gtgggccgag gccggcatgg ccgccgacga tgccgtttcc
15851 gagaccctgc gccgccaggg cctcggcctg ctcgacccgg ccccggccat
15901 gaccgagctg cgccgcgccg tcgtccggca ggacgtcacc gtcaccgtcg
15951 ccgacgtgga ctggcagcgc tacgcaccgc tgttcacctc cgcccggccc
16001 agcgccctga tcgccggcct gcccgaggtc cgcgccctcg ccgccgacga
16051 gcgcaccgag caggacgcca ccggcgcctc cgaggtcgtc acccgcgtcc
16101 gcgccctggc cgaacccgag caactgcgcc tgctgaccga cctcgtccgc
16151 accgagtccg ccaccgtcct cggccacagc tccgccgacg ccgtgcccga
16201 gggccgcgcc ttccgcgacg tcggcttcga ctcgctgacc gcggtcgagc
16251 tccgcaagcg cctgggcgcc gcgaccgggc tgtccctgcc cagcaccatg
16301 gtcttcgact acccgacacc gctggaactc gcccagtacc tgcgggcgga
16351 gatcctcggc gcggtgctgg aagtcgccgg cccggtcgcc accggcggcg
16401 ccgacgacga gccgatcgcc atcatcggca tggcctgccg cttccccggc
16451 ggcgtcagct ccccggaaca gctgtgggac ctggtcgcct ccggcaccga
16501 cgcgatcagc gagttccccg tcaaccgcgg ctggcagacc gggcacctct
16551 tcgacccgga ccccgaccgg cccggcacca cctactccac ccagggcggc
16601 ttcctccacg aggccgacga gttcgacccc accttcttcg gcatctcgcc
16651 ccgcgaggcg ctggtcatgg acccgcagca gcggctcctg ctggagacca
16701 cctgggagtc cttcgagcgc gccgggatcc gcccggaaac cctccgatcc
16751 accctgaccg gcaccttcgt cggctccagc taccaggagt acggcctggg
16801 cgcgggcgac ggcaccgagg gccacatggt caccggcagc agccccagtg
16851 tgctctccgg ccgactgtcg tacgtcttcg gtctggaagg cccggcggtc
16901 accgtcgaca ccgcctgctc gtcctcgctc gtggcgctgc acctggcctg
16951 ccagtcgctg cgcaacggcg agagcaacct ggccgtcgcc ggcggcgcca
17001 cgatcatgac gacgcccaac ccgttcatcg cgttcagccg gcagcgcgcc
17051 ctcgccaagg acggccgctg caaggcgttc tccgacgacg cggacggcat
17101 gacgctcgcc gagggcgtcg gcgtcgtcct cgtcgagcgg ctctccgacg
17151 cgcagcgcaa cggccacccg gtcctggccg tcctccgcgg ctccgccatc
17201 aaccaggacg gcgcctccaa cggcctgacc gcgcccaacg gcccgtccca
17251 gcagagggtc atccgccagg ccctcgccaa cgcccgcctc gcgcccgggg
17301 acatcgacgc cctggaggcg cacggcaccg gcacaccgct cggcgacccc
17351 atcgaggccc aggcactgtt cgccacctac ggccgcgacc gcgaccccga
17401 gagcgcgctg ctgctcggct cggtgaagtc caacatcggc cacacccagt
17451 ccgccgcggg catcgccagc gtcatcaaga tggtcatggc gctgcgccac
17501 tccgaactgc cgccgaccct gcacgccgac gcgccgtcct cgcacgtgga
17551 ctggtcggcc gggacggtcc ggctgctgac ccaggcgcgc gcctggccgg
17601 agaccggtcg cccgcgccgg gccgcggtgt cctcgttcgg catcagcggt
17651 accaacgccc atgtcctgct ggagcaggcg cccgtcgcgg acaccccggc
17701 cgaggagcgg cccgccgtgg cgccggtccc gatcgccgcc ggcgtcgtcc
17751 cgtgggtggt caccgcccgc agcgccgccg ccctgcgcgg ccaggccgag
17801 cgcctcctcg cgcacgccga aaccgtcgga accgccctgc cggccgccgg
17851 accgctcgac atcggcctgt cgctggtctc cgcgcgcgcc cgtttcgagc
17901 accgtgccgt cgtcgtcccg cccgcgggca ccgacccgct ggccgcgctg
17951 cgcgccgtcg cgacggacgg gccctcgccc gtggtcgccc gtggcgtggc
18001 ggacgtcgag ggtcggacgg tgttcgtgtt ccccggtcag ggttcgcagt
18051 gggtggggat ggggtcccaa ctccttgatg agtcggcggt gttcgcggag
18101 cggattgccg agtgtgcggc ggcactcgcc gagttcaccg actggtcgct
18151 ggtcgatgtg ctgcggggtg tggtgggtgc gccgtcgttg gagcgggtcg
18201 atgtggtgca gccggcgtcg ttcgcggtga tggtgtcgtt ggctgcgttg
18251 tggcgttccc gtggtgtgtt gccggatgcg gtggtggggc attcgcaggg
18301 tgagatcgct gctgcggtgg tgtcgggtgc gctgtcgttg cgggacgggg
18351 cgcgggtggt ggcgctgcgg agtcaggcca ttggtcgtgc gttggcgggg
18401 cggggaggga tgatgtccgt cgcgctgtcg gtggacgtgc tcgaaccgcg
18451 gttggtcgag ttcgaggggc gggtgtcggt ggccgccgtc aacggcccgc
18501 gctccgtcgt ggtcgccggc gagcccgagg cgctggacgc gctgcacgcc
18551 cggctgaccg ccgacgacat ccgggcccgc cggatcgcgg tggactacgc
18601 ctcgcactcg caccaggtcg aggacctgca cgaggaactg ctggaggtgc
18651 tggcggagct ggcgccgcgc acgtcggagg tgccgttctt ctcgaccgtg
18701 accggcgact ggctggacac cgcgcggatg gacgccggct actggttccg
18751 caacctgcgc ggacgggtgc ggttcgcgga cgcggtggcg gacctgctgg
18801 cggcggagta ccgcgcattc gtcgaggtca gctcgcaccc ggtgctgtcg
18851 atggcggtgc aggaggcgat cgacgaggcc ggcgtgccgg ccgtcgccgc
18901 cggcaccctg cgccgcgacc agggcggcac cgaccgcttc ctgctgtcgg
18951 ccgccgaggt cttcgtgcgc ggtgtggacg tggactgggc ggggctgttc
19001 gaggggaccg gtgcgtcccg gatcgacctg cccacctacg ccttccaaca
19051 cgaacacctg tgggccgtcc cgcccgcccc ggaggccgtc gccgccgccg
19101 acccggacga cgcggccttc tggaccgcgg tcgaggacgg tgacgtctcc
19151 gcgctgaccg ccgcgctcgg caccgacgag gactccgtcg ccgccgtgct
19201 gcccgccctg acctcctggc gccgggcccg ccgcgaccgc tccaccgtgg
19251 acgcctggcg ctaccgcgtc gcctggaaac ccctcggcgg caccctgccg
19301 cacccgtccc tgaccggcac ctggctgctg gtcaccgccg acggcatcga
19351 cgacaccgat gtggcagggg cgttggagac ctacggcgcc gaggtgcgcc
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19401 


ggctggtcct 


ggacgaggag 


tgcgtcgacc 


gcgccgtcct 


gcgggagcgg 




19451 


ctggccggcg 


cggaggacgt 


gaccggcatc 


gtctccgtcc 


tcgccgccgc 




19501 


cgagcggacc 


gacgcggtac 


cgggcacctc 


cctggtgctc 


ggcaccgccc 




19551 


tgaccgtggc 


actgatccag 


gccctcggcg 


acgccgaaat 


cgacgctccc 


5 


19601 


gtatgggcgt 


tgacccgcgg 


cgcggtctcc 


accggccggg 


ccgacgagct 




19651 


gaccgcgccc 


gtccaggcac 


aggtcaccgg 


catcggctgg 


accgcggcgc 




19701 


tggagcaccc 


gcagcgctgg 


ggcggcaccc 


tcgacctgcc 


cgccgccctc 




19751 


gacgcccggg 


ccgcccagcg 


gctcgccgcc 


gtgctgtccg 


gcgccctcgg 




19801 


cagcgacgac 


cagctggcca 


tccggccctc 


cggggtcttc 


acccgccgca 


10 


19851 


tcgtgcgggc 


cgaggccacc 


gccgggcggc 


ccgccggcac 


ctggacgccg 




19901 


cgcggcacca 


cactggtcac 


cggcggctcc 


ggcaccctcg 


ccccgcacct 




19951 


cgcccgctgg 


ctggcccaac 


gcggcgccga 


gcacctggtc 


ctgatcagcc 




20001 


ggcgcggcac 


ggccgccccg 


ggcgccgccg 


aactcgtcgc 


ggaactggcc 




20051 


gagtcgggca 


ccgaggcgac 


cgtcgccgcc 


tgcgacatca 


ccgaccgcga 


15 


20101 


cgcggtcgcc 


gcgctgctgg 


ccgacctcaa 


ggccgacggg 


cgcaccgtcc 




20151 


gcaccgtcgt 


gcacaccgcc 


gccaccatcg 


agctgcacac 


cctggacgcc 




20201 


accaccctgg 


cggacttcga 


ccgggtgctg 


cacgccaagg 


tcaccggcgc 




20251 


ccaggtcctc 


gccgaactgc 


tcgacgacga 


agagctggac 


gacttcgtcc 




20301 


tgtactcctc 


caccgccggc 


atgtggggca 


gcggcgccca 


cgccgcctac 


20 


20351 


gtcgccggca 


acgcctacct 


cgccgcgctc 


gccgagcacc 


gccgggccaa 




20401 


cggactgccc 


gccctgtcgc 


tgtcctgggg 


catctgggcc 


gacgacctca 




20451 


aactgggccg 


ggtcgatccc 


cagatgatcc 


ggcgcagcgg 


cctggagttc 




20501 


atggacccgc 


agctggccct 


gagcggcctg 


cagcgggcgc 


tggacgacaa 




20551 


cgagaacgtg 


ctcgcggtcg 


ccgacgtgga 


ctgggagacc 


taccaccccg 


25 


20601 


tctacacctc 


cggccgaccc 


accccgctct 


tcgacgaggt 


gccggaggtc 




20651 


cgccggctca 


ccgcggccgc 


cgagcagagc 


gccgggaccg 


tcgccgaggg 




207O1 


cgagttcgcc 


gccgcgctgc 


gcgccctgtc 


cgacgccgag 


cagcagcgca 




20751 


ccctgctgga 


gaccgtccgc 


accgaggcgg 


cgtccgtcct 


cgggctgtcc 




20801 


tccgccgagg 


acctcaccga 


ccagcgggcc 


ttccgcgacg 


tcggcttcga 


30 


20851 


ctcgctgacc 


gccgtcggcc 


tgcgcaaccg 


gctcgcctcc 


gtcaccggcc 




209O1 


tgacgctgcc 


ctcgacgatg 


gtcttcgact 


accccaaccc 


ggccgcgctc 




20951 


gccgcctatc 


tgcacggcga 


gctggccggc 


gcccggtccg 


ccgccgccgg 




21001 


cgccgccgcc 


gtcccgaccg 


gcgcccccga 


cgccgacgac 


ccgatcgcga 




21051 


tcgtcggcat 


gagctgccgc 


taccccggcg 


gggtcggctc 


cgccgaggac 


35 . 


21101 


ctgtggcgga 


tcgccctgga 


cgaggtcgac 


gcgatctccg 


gcttccccgc 




21151 


cgaccgcggc 


tgggacgccg 


agggcctcta 


cgacccggac 


cccgaccggc 




21201 ccggccgcac ctactccgtc cagggcggat tcctgcgcga cgtcgccgag
21251 ttcgacccgg gcttcttcgg gatctcgccg cgcgaggcgc tgtcgatgga
21301 cccgcagcag cggctcctgc tggagaccgc ctgggaggcg ttcgagcacg
21351 ccggcatcga cccggtcggc cagcgcggca gccgcaccgg caccttcgtc
21401 ggcgccagct accaggacta cgcctccggc gtgcccaaca gcgagggctc
21451 cgaaggccac atgatcaccg gcacgctctc cagtgtgctg tccggccggg
21501 tgtcctacct cttcggcttc gagggccccg ccgtcacgct cgacaccgcc
21551 tgctcctcct ccctggtcgc catgcacctg gcctgccagt ccctgcgcaa
21601 cggggagagc tcgctggccc tggccggcgg cgtcagcatc atgtccaccc
21651 cgatgtcgtt cgtcggcttc agccggcagc gcgccctcgc cgaggacggc
21701 cgctgcaagg cgtacgcgga cggcgccgac gggatgaccc tcgccgaggg
21751 cgtcggcctg gtgctgctgg agcggctgtc cgacgcccgc gccaacgggc
21801 accaggtgct cgccgtgatc cgcggctccg ccgtcaacca ggacggcgcc
21851 tccaacggcc tgaccgcacc caacggcccg tcccagcagc gcgtcatccg
21901 ccaggcgctg gccaactccg cggtggcgcc cggcgacatc gacgtcctgg
21951 agggccacgg caccggtacc gccctcggcg accccatcga ggcgcaggcc
22001 ctgctcgcca cctacggcca ggaccgcgcc cccgaacggc cgctgctgct
22051 cggctcggtg aagtccaaca tcggccacac ccagatggca tccggcgtcg
22101 ccagcgtcat caagctcgtc cgcgccctcc aggaaggcgt ggtgcccaag
22151 tccctgcaca tcgaccggcc ctccacccac gtcgactggt cctcgggcgc
22201 catcgggctg ctcaccgaac gcaccccgtg gcccgagacc ggccggccgc
22251 gccgcgccgc cgtctcctcc ttcggcatca gcggcaccaa cgtccacacc
22301 atcctcgaac aggcccccgc ggacgaggcg cccacgcccg ccgacccgcc
22351 gcgggacggc ctggtgccgg tcctgctctc cggccgcggc gaggccgcgc
22401 tgcgcgccca ggccgcccgc ctgctcgcct tcgtcgagga gcggcccgag
22451 gcccacctca ccgacctcgc ccactccctc gccacctcgc gcgccgcgct
22501 ggaacgccgc gccgccgtca tcgccgccga ccgcgacacc ctgacccgcg
22551 gcctgcgcgc cctgtccgac ggccggcccg accccggcct ggtccagggc
22601 accgcgggac gcggccggac cgccttcctg ttcaccggac agggcagcca
22651 gcgccccggc atgggccgcg aactccacga ccgctacccg gtgttcgccg
22701 acgcgctgga cgaggtgctg gcccggctcg acgacggacc ggaccggccg
22751 ctgcgcgagg tgctgttcgc cgcgcccgac tccgccgagg ccgcgctcct
22801 ggaccggacc ggctacgccc agcccgcgct gttcgccgtc gaggtcgcgc
22851 tgttccgcct gctgacgtcc tggggcctga ccccggacta cctggccggc
22901 cactccgtcg gcgaactcgc cgccgcgcac gtcgccggcg tgctgtcgct
22951 ggacgacgcc tgcactctgg tcgccgcccg cggccggctc atgcaggcgc
23001 tgcccgaggg cggcgcgatg gtcgccctgg aggccgcgga ggacgaggtc
23051 ctgccgctcc tggaggggct caccgaccgg gtgtccgtcg ccgccgtcaa
23101 cgggccgcgg tccgtggtcg tcgccggcgt cgaggaggac gtgctcctcc
23151 tcgccgacct cttcgccgcc gacgggcgcc ggaccaagcg gctgcgggtg
23201 agccatgcct tccactcgcc gctgatggac gccatgctcg acgacttcgc
23251 cgccgtcgcc cgcgggctga cctaccaccc gccgacgatc ccgttcgtgt
23301 cgaacgtcag cggcggcctg gccaccgccg aacaggtccg cacgcccgac
23351 tactgggtcg ggcacgtccg cgccgcggtc cgcttcgccg acggcatcga
23401 ctggctcgcc acccagggcg acgtccacac cttcctggag ctcggcccgg
23451 acggcgtgct cagcgccatg gcccgggaga gcctcaccga cccgtcccgc
23501 acggcactgc tgccgaccct gcgcggcgac cggcccgagg aacctgccct
23551 ggtcaccgcg gtcgccgcgg cccacgcgca cggcgcccgc gtcgactgga
23601 gcgggtactt cgccgaccac ggcgcgcgcc ggaccacgct gccgacctac
23651 gcgttccaac gcgagcggta ctggcccgac accaccgccg ccacgagcgc
23701 ccacacgccc ggatccgccc tcgacgccga gttctgggcc gccgtcgagc
23751 gggacgacgt cgccgccctc gccgcctccc tggacctgga cgacgccacc
23801 gtcaccgcga tggtccccgc gctcaccgcc tggcgccggc gccgcggcga
23851 gcagaccgaa ctggactcct ggcgctaccg cgtcacctgg aagccgcgcg
23901 gcggcgccac cgcacccgcc gccctcaccg gccgctggct cgtgctcgtc
23951 ccgcacgacc accaggaccg tcaggacgac gcgaccgcgg cctgggcagc
24001 cgacgtcgag accgccctgg gcaccaccac cgtccggctg acggtcacca
24051 ccaccgaccg cgccgcgctg gccgcccgga tcaccgaagc cgccggcgac
24101 cagggcccgt tcagcggtgt gctgtccctg ctgccgctcg ccaccggcga
24151 cgccggccac cccggtgcgc ccgccgccct caccctcacc accaccgccg
24201 tccaggccct cggcgacgcc ggcatcgacg cgccgctgtg gaacgtcacc
24251 cgcggggccg tggccgtcgg ccgcgccgaa caggtcaccg cgcccgaaca
24301 ggccgccgtc tggggcctgg gccgcgccgt cgccctggaa ctgccggccc
24351 ggttcggcgg caccctcgac ctgcccgcca ccctggacgg ccaggccgcc
24401 cgccggttgc gcgcggtgct cgcggctacc gacggcgagg acgcggtggc
24451 cctgcggccc tccggcgtct tcctccgccg cctggcccac gccccggccg
24501 gccccgacac cgcccgcacc gccttcgacc cggcggccgg caccgtcctg
24551 atcaccggcg gcaccggcgg catcggcggc cacgtcgccc gccgcctggc
24601 ccgcgacggc gccacccacc tgctgctcac cagccgccgc ggcccggccg
24651 cccccggcgc cgacgcgctc cgcgccgaac tggaggaact gggcgcccgg
24701 gtcaccctcg ccgcctgcga cgccgccgac cgcgacgcgc tggccgcgct
24751 cctcgccgaa ctgcccgacg acgccccgct gtgcgcggtg ttccacaccg
24801 ccggcgtcgt cgaggaccac gtcgtggacg cgctcacacc ggagaacttc
24851 gccgcggtgc tgcgcgccaa gaccgtcgcc gcccaccacc tgcacgagct
24901 gaccgccgac ctggacctcg ccgctttcgt gctgttctcc tccacggccg
24951 gcgtcctcgg cgccgccgga cagggcaact acgccgcggc caacgcccac
25001 ctggacgccc tcgccgaaca ccgccgctcc cacggcctga ccgcgctgtc
25051 cgtcgcctgg ggcccgtggg ccggctccgg catggtcgcc gacgccgccg
25101 aactcaccga ccgggtacgg cgcggcggct tcgaaccgct cgcccccgaa
25151 ccggccgtgc gcgccctgct gcgcgccatc gagaacgacg acaccaccgt
25201 cgcgctcgcc gacatcgact gggagcgctt ccagcgcgcc ttcgccgcgg
25251 tccgcccgct gccgttcgtc gccgacctcc ccgagaccgg ccgggccacc
25301 cccgcgaccg ccaccggcgc cgccaccggc ctgcggcagc aactcgccga
25351 actgccggag cacgagcgcc cggcagcggt cctggacctg ctgcgtaccc
25401 aggtcgccgc cgtcctcggc cacgccgacc cgcgcaccgt cgaggacgac
25451 cacgccttcc gcgacctggg cttcgactcg ctgaccatcc tggaactgcg
25501 caacgccctc aacgccgcca ccggcctgag cctgcccgcc accctggtct
25551 acgacctgcc caccccgcgc gagatggcgg acttcctgct cgccgaactc
25601 ctcggcaccc tgcccaccga caccgccgcg accgtcgcca gcacggcctc
25651 ccccaagctc tcagcttcgt tcgagcaggg cggtaccccc ttcgacgacc
25701 cgatcgccgt catcggcatc ggctgccgct tccccggcgg cgtcaccacc
25751 ccggaggagc tctggcagct cctcgacgag ggccgcgacg gcatcagccg
25801 cttccccgac gaccgcggct gggacctcgc cgcgctgggc gccggcgcct
25851 ccgacaccct ggagggcggc ttcctgaccg gcgtcgccga cttcgacgcc
25901 cggttcttcg gcatctcgcc ccgcgaggcg ctggccatgg acccccagca
25951 gcggctgctg ctggagacca cctgggaggc gctggagcgg gccggcatcg
26001 acccgaccac gctgcgcggc tccaccaccg gcgtcttcgt cggcaccaac
26051 ggccaggact acccgacgct gttgcgccgc tccgcctcgg acgtggccgg
26101 ctacgtcgcc accggcaaca ccgccagcgt gatgtccggc cgcctgtcct
26151 acgcgctcgg cctcgaaggc ccggccgtca ccatcgacac cgcctgctcc
26201 tcctcgctcg tcgccctgca ctgggccggc cgggcgctgc gcgccggcga
26251 gtgcgacctc gtggtggccg gcggcgtctc ggtcatggcc agcccggact
26301 ccttcgtcga gttctccacc cagggcggcc tggcacccga cgggcgctgc
26351 aaggcgttct ccgacgccgc cgacggcacc gcctggtccg aaggcgtcgg
26401 catcctcgtc ctggaacgcc tctccgccgc ccgccgcaac ggccaccagg
26451 tcctcggcct gatccgcggc accgccgtca accaggacgg cgcgtccaac
26501 ggcctgaccg cgcccaacgg cctctcccag cagcgcgtca tcgcccaggc
26551 actcgccgac gcccgcctgc gccccgccga catcgacgcg atcgaggcgc
26601 acggcaccgg caccaccctc ggcgacccga tcgaggcccg cgccctgatc
26651 accgcctacg gccgggaccg ggacgccgaa cggccgctgc tgctgggcac
26701 cgtcaagtcc aacatcgggc acacccaggc cgccgccggt gccgccggcg
26751 tcatcaagat gctgatggcg atgcgccacg gcaccctgcc caggacgctg
26801 cacgtgggca ccccgtccag ccacgtcgac tggagcggcg gcaccgtcgc
26851 gctcctcgac gacgcgcggc cctggccacg gaccgggcag ccgcggcgcg
26901 ccggcgtctc cgccttcggc gtcagcggca ccaacgccca cgtcgtcgtc
26951 gagcaggccc cggaaaccga agcccccgcc gccccggccg ccgagccggc
27001 gccggaggcc acgcccaccg tcgtcccctg ggtcgtctcc ggacgcagcc
27051 gggaagcgct ccaggcgcag ctggaccggc tcaccgcgca caccgccgcc
27101 caccccgcgc gctcggcggc ggacgtcggc cgctcgctgg ccaccgaccg
27151 cacgctcttc ccgcaccgcg ccgtgctgct cgccggcccg gacggggtgc
27201 gcgaggccgc ccgcgccgcc gcgccccgca cccccggccg caccgcgttc
27251 ctgttctccg gacagggcgc ccagcacgcc ctgatgggcc acgacctgta
27301 ccagcgcttc ccggtctacg ccgacgcact ggacaccgtc ctcgcccagt
27351 tcgacaccgt gctggacgtc ccgctgcgcg ccgcgctgtt cgccgcgccg
27401 ggcacccccg aggccgcgct cctggaccag accggcttca cccagcccgc
27451 gctgttcgcc gtcgaggtcg cactgttccg gctcgccgag tcctggcggc
27501 tgacgccgga cttcgtcgcc ggccactcca tcggcgagat c



SEQ ID NO: 3

NysD2 (incomplete - probably C-terminally truncated) 



1 msftypvsmp wlqgreldyv teavgggwis sqgpyvrrfe eafaayndvp 

51 fgvacssgtt altlalralg vgpgdevivp eftmiasawa vtytgatpvf 

101 vdcgddlnid vsrieekitp rtkvimpvhi ygrqcdmdav lnlayeynlr 

151 wedsaeahg vrprgdiacf slfankiisa geggvclthd phlaeqmahl 

201 ramaftkdhs flhkklaynf rmtnmqaava laqteqldti lalrrdiekr

251 ydealrdipg itlmpprdvl wmydlraerr delcaylage gietrvffkp 

301 msrqpgyfsa dwpalnaarl sadgfylpth tgltaqeqef itgri 



SEQ ID NO: 4

NysD1



1 vtlpsgntrl gwrrrrmhsp gdragrvrga rarrpatfrg vlsmganrrp 
51 ilfvsyaesg llnpllvlag elsrrdvadl wfatdekard evaawdgsp
101 vrfaslgdtv sqmsavtwdd atyaevtqrs rfkahaavir hsfapesrma

151 kyrrleeive evepalmvie smcqfgyela itkgipfvlg vpfvpsnvlt
201 shvpfaksyt psgfpvphsg lpaamslaqr ienqlfrlrt lgmfltsdvr
251 kweednrvr telgiapqar qmmaridhae qvlcysvrel dypfpmhpkl

301 rlvgtmvppl pqapdddgls dwlsaqksw ymgfgtitrl treqvaslve 

351 varrldgrgh qvlwklprgq qellppaael pdnlriegwv psqldvlahp 

401 nvkaffthag gngyheglyf gkplwrplw vdcddqairg qdfgvsltld 

451 rpetvdtedv ldkitrvldq psfteraehf agllrdaggr aaaadlllgl

501 palatd 

SEQ ID NO: 5



NysA





1 mtigadedpv vwgmacryp ggvagpedlw elvrtgrdat tafpddrgwd
51 laalagdgpg rsatreggfl tgaadfdaaf fgmspreavs tdpqqrlvle
101 tawealerag idphslrgsr tgvfvgasgq dyaavthasp ddldghaltg
151 lapgvasgrl ayvlglegpa vtvdttssss lvalhwavra lragecstal
201 aggvtvmstp aafvghtrqg glapdgrckp fsddadgtaw aegvgiwle
251 hlstaraagn pvlavlrgsa vnqdgasdgl tapsgpaqer viraaladar
301 lapadidlve ahgtgtrlgd pvearallaa ygqdrdpdrp lrlgslkstl
351 ghaqaaagig gviktvltlr hglmprirhl atptrqvdws qgavapltdh
401 tpwppadrpr ragvssfgis gtnahvilee appadvpvtr pgtlrpstvp
451 wpvsaatpea ldaqlarlra hlrthsdldp ldvgyslatg raalrhravl
501 lppadgtaad avehargaah qrrtavlfsg qgsqrpgmgr elaarfpvfa
551 dalddalral drhldgpvre vmwgtdaall drtgwtqpal favevalhrl
601 vaslgvtpdf vgghsvgeia aahvagvlsl edacrlvaar atlmqalpag
651 gamaaleate devapllgah lalaavngpt avwagaeda vrqltarfad
701 rgrrtsrlav shafhsplme pmldafrdw srltfhqpsi plvsnltgel
751 agseitsaey wvrhvrdtvr fadgitalak agadvlielg pggvlsamar
801 dtlgpdsttd wpalskgrp eetafagalg rlhtlgypvd wpafyagtga
851 rrvelptyaf qhvrhwptpp rpngagpgal ghpllgsave ladgggtvcs
901 galslrthpw ladhtvagrv vlpatallel avragdeagc dvlhelhltt
951 ppalpddaal hvqvhvgpad ttgrravtvh trpdhhpagd wtrcatgtlg
1001 stppsaaeaa tggtpaawpp adaepldlad hyerladrgf dygptfrglr
1051 aawrrgaeif advecppgta ddapdhglhp alldaarhaa mavdgtvpva
1101 whgvrlhavg atalrvrirp tttgtltlta vdvhgapwt vealtarplt
1151 deeraaprtp rqargetpad arparpaaar pgpageplpd ttgshptagh
1201 laalppaare rqlldlvrtq aaavlghpgp eavgtrsvfk elgfdslagv
1251 eladrltart glrlpatlvf nfptperaah rlgellaata pldpgaygee
1301 ltrfeaivtn lpqdgperra vadrldaivs alrqnspaev pssdedidtv
1351 svdrlldiid eefett



SEQ ID NO: 6
NysB 



1 mqepqqgqpd qqekivdylr rvtsdlrrar rrigeleskd nepiaivgmg
51 crlpggvnsp eslwdlvrsg gdaisgfpvd rgwdletltg ngdgssathe
101 ggflydaaef daaffgispr eatamdpqqr lllevaweal eragiaptal
151 rgsrsgvfvg syhwgapsad aatelhghal tgtaasvlsg rlaytlgleg
201 pavtvdtacs sslvalhlaa qslrvgessl aviggvtilt epsvfvefsa
251 qgglapdgrc kafsdaadgt gwaegvgvlv aerlsdaqrn ghpvlavlrg
301 savnqdgasn gltapngpsq erviqqalar tgltpadida veahgtgtrl
351 gdpieaqall atygqghtpd qplwlgslks nightqaaag vagvikmvma
401 lrhghlpptl hadapsshvd wsagsvrllt egqqwpetgr prraavssfg
451 isgtnahall eqaphpadta dagddaapte pagapaalpw ivsghspqal
501 rdqaaalaar vetdpalrpq dightlhtar allerravw apdraellaa
551 thelaagrsa nawegladv egrtvfvfpg qgsqwvgmga qlldesavfa
601 eriaecaaal aeftdwslvd vlrgwgaps lervdwqpa sfavmvslaa
651 lwgsrgvlpd awghsqgei aaawsgals lrdgarwal rsqaigrala
701 grggmmsval svdvleprlv efegrvsvaa vngprswva gepealdalh
751 arltaddira rriavdyash shqvedlhee llevlaelap rtsevpffst
801 vtgdwldtar mdagywfrnl rgrvrfadav adllaaeyra fvevsshpvl
851 tmavldliee agvtavatgt lrrdqggagr fllsaaevfv rgvdvdwaga
901 fegtgaarvd lptyafqrer ywntrtaadr tpadapmdae fwaaveqadv
951 saltaalgtd edsvaailpg ltswrrarsq rttldswryr vtwtplaqvp
1001 ratltgtwll vttdgiddtd vagalesyga evrrlvldee ctdravlrer
1051 lagaedvtgi vsvlaaaedd aarhpgltrg laltvslvqa lgdaeatapl
1101 wfltrgafat gpsdpvtrpl qsqiagvgwt talehpqrwg gtvdlpdtld
1151 araaqrlaaa lsgalgaedq lavraagvla rrivraghra grpartwapr
1201 gttlitggsg tlapqlarwl aergaehwl vsrrgadapg apeliaeaae
1251 sgtevtvaac ditdrdavaa lladltadgr tlrtvihaaa aielsaladt
1301 tvaefadwh akvtgarild ellddaeldd fvlysstagm wgsgvhaayv
1351 agnaylsala eqrrarglrt
1401 dpelaltalq hvldddetvi
1451 rrldqasvpd pagpaadgla

aqvlghadat evetgrqfqd lgfdsltave
yptphaladh lrdellgtea esttavpvpt 
giaspedlwr lvsqgadatg pfptnrgwdl 
flhdagsfda dffgmsprea matdsqqrll 
sgtgvfagvm yndygttltg deyeafrgng 
vtvdtacsss lvalhwaaqa lragecslal 
glapdgrska faeaadgvaw segvgmlvle 
vnqdgasngl tapngpsqqr virqalasgg 
pieaqallat ygrdrdpenp lllgsiksni 
hgvlpqtlhv dapsshvdws vgavellteq 
gtnahwieq sptavpatpa sadrsveepp 
llahveahpa lrpvdisysl iatrtafdhr 
getdpaaltg tvrtgrtafl fsgqgsqrlg 
taldaelghp lrdiiwgeda qlvdrtgytq 
pdfvaghsig eiaaahvagv lslgdacrlv 
atedevlpll tddvsiaavn sptsvwsgy 
lrvshafhsp lmapmlddfr awesltfta
sadywvrhvr eavrfadgir tladrgvttf 
agtipllrrd rpeeqavlaa lchlqvlgve
afqhrwfwpa arparpddvr aaglgaaehp 
slrthpwlad htvlgtvllp gtalvelavr 
lpedgatllq vrvgsaddtg rrtvtvharp 
paaaafdttv wppadaeplt tddcyahftt 
vlyaevalpe satdeaaafg lhpalldagl 
tlhasgatal rvrlapngpn glsvtaadpa 
tihsaltrda lfhldwtpvp lpdtansapp 
rhatlddlla gdttppatvl vplgapldgd 
atdrladsrl vfvthgavat ddapptdlaa 
ldldtepdst talsraltld epqlllragr 
sadgtvlvtg gtgglgglva rhlvrscgvr 
eleslgarw vaacdvgdgs avaelvagvs 
gsltperlaa vlrpkvdgaw nlheatrgld 
nyaagnafld almvhrvagg lpgvslawga
esgmplltvd qgvalfdaal atgsaalvpv 
vraplrrtaa tglatgadtg lvqrlgrldh 
hadgnaidae rafrdlgfds ltavelrnrl 
alaehlrdel fgavesevrv pvqalpptad

rragvssfgi sgthahvile rpeaarrpvm etntvepstv
7851 pwvlsgktpe alraqaakll ssieerpelr lvdvgmslvt grstfehrav
7901 vlaadradaa ralsaiaade adaaaatgrv gagrhavlfs gqgaqrlgmg
7951 relyerfpvf aealdwvdh ldaalpaqag lrevmwgdda ellnetgwtq
8001 palfaieval frlveswgvr pdfvaghsig eiaaahvagv fsledacrlv
8051 aaratlmqal paggamiavq atedevtphl tddvaiaain gpnalwsgv
8101 edaaveigar faaegrrttr lhvshafhsp lmdpmlaefr waeglsyaa
8151 pslpwsnlt gqvatadelc saeywrhvr eavrfadgvt aleaegvrtf
8201 lelgpdgvla amagasltes slavpllrkd rpeepaalaa laqlhiagar
8251 vdwpvlfagv gagrvelpty afqrgwfwpv grvgvggdvg avglgsaghp
8301 llgaavelaa gagwltgrl slsshgwlad havmgrvfvp gtallemvmr
8351 agdevgcgrv eeltlaaplv lperggvrvq vavdapdaag rrgvgvyscp
8401 dgvgqavwsq havgvlasgv adqvggfgdg gvwppqgavs vdaegcyelf
8451 adagfgygpv fqglravwrr geelfaeval sdevaesadt atgfglnpal
8501 ldaslhasll sslegqsadg gpalpfaweg vslfasgata lrvrlapage
8551 havsvtavdp tgapvisida lrtrrltlde vnashtqlsd alfgvqwttv
8601 pstpaadhps vaiigtdhlg laealssssa gatttttaaa yesldaliaa
8651 gpevsvpdvt liglttedai aqyvndhdat vagqgtigag aaavdaarrl
8701 taealrtiqa wladerlaar rlvfvtrgaa dgqdvaaaav qglvrsaqte
8751 npgtfglldl dgteastavl gealtsdepq lllrdghlha arltrlaspa
8801 dtavptewna dgtvlitggt gglgaqfarh lvdrygvrnl llvsrrgpda
8851 pgttelvael tahgaevavq acdvadgdav aalvagvpde hplrawhta
8901 gvlddgvigs lteerlatvl rpkadaawhl heatrgldld afwfssvag
8951 vfggagqany aaanafldal maqrraaglp glslawgpwd qtggmtgmls
9001 daeadrlars gipplsaeqg lalfdaalal agtstpdraa gsaaastsgt
9051 gdtiaipaaa lvapvrldla alaaqgevpa ilrglvrtrt rrtaaggsvt
9101 vaglvnrlsg ltaderrqel lelvrtqaal vlghadpasv dstaqfrdlg
9151 fdsltavelr nrlstatglr ltatlvfdyp ntdalaehlr delfgavese
9201 vrvpvqalpp taddpivwg macrfpggvt spedlwrlvd agtdaittfp
9251 tnrgwdlesl ydpdpahlgt sytrsggflh eagefdpaff gmsprealat
9301 dsqqrllles sweaieragi dpltlrgsat gvfagvmysd ygsilggkef
9351 egfqgqgsag svasgrvsyt lgfegpavtv dtacssslva lhlaaqalra
9401 gectlalagg vtvmstpgtf vefsrqrgla pdgrskafae aadgvgwseg
9451 vgilvlerqs davrngheil avirgsavnq dgasngltap ngpsqqrvir
9501 qalasgglst advdaveahg tgttlgdpie aqallatygr drdpenplll
9551 gsiksnlght qaaagvagvi kmvmamrhgv lpqtlhvdap sshvdwsvga
9601 vellteqtvw petgrvrrag vssfgisgtn ahvileqpea vqrlapgaae
9651 tvepvaikps aepslvpwal sgkspealra qaarlrdfla erpeprsidi
9701 ghslavtrsq fdhraivlvd dakapadsla alaalasgva dpawsdavs
9751 tggsavlftg qgaqrlgmgr elygrfpvfa ealdvwdhl daalpaqagl
9801 revmwgddve llnetgwtqp alfavevalf rlverwgvrp dfvaghsige
9851 iaaahvagvf sledacrlva aratlmqalp tggamiavqa tedevtphlt
9901 devaiaavng ptswisgae eatqtvaqhf adqgrrttal rvshafhspl
9951 mdpmlaefra vaeglsyatp slpwsnltg wlatadelcs aeywvrhvre
10001 avrfadgitt leaegvrtfl elgpdgilsa laqqslagea vtvpvlrkdr
10051 geestaltar ahlhtrglie dwqdffagvg agrvelptya fqrgwfwpvg
10101 rvgvggdvga vglgsaghpl lgaavelaag agwltgrls lsshgwladh
10151 avmgrvfapg tallemvmra gdevgcgrve eltlaaplvl perggvrvqv
10201 avdapdaagr rgvgvyscpd gvgqavwsqh avgvlasgaa dqvggfgdgg
10251 vwppqgavsv daegcyelfa dagfgygpvf qglravwrrg eelfaevals
10301 devaesadta tgfglhpall daslhaslls slegqsadgg palpfawegv
10351 slfasgatal rvrlapageh avsvtavdpt gapvisidal rtrrltldev
10401 nashtqlsda lfgvqwttvp stpaadhpsv aiigtdpfgl adglsdalpl
10451 veergdlaal aasehpvpdl vlvpvagtrr tgvpadaegh tdagtsdmlr
10501 svreataqvl eqiqqwladd rfeaarlvfv trgavsvgeg giadlaasav
10551 wglvrsaqse npgcfglldl dldlaldsdl apevdierdr drdpvggtvq
10601 palaaalhat adepqlalrg gtvqaarltr ipapqtdrae tdpaetdrpe
10651 idtrrpgtvl itggtgglgg llarhlvaer gvrslvlasr sglaaegaek
10701 lvadlealga wavqtcdva dgdavaalva gvsdeyplta whtagvldd
10751 gvigslteer latvlrpkad aawhlheatr dldldafwf sslagvlgga
10801 gqanyaaant fldalmaqrr aaglpgvsla wgpwdraggm tgtlsdaead
10851 rlarsgvppi saeqglalyd aatagerplv vpvrldlaal rglgdvpall
10901 rglvrtparr taaagaapsa dvltrqlagl ggaeqeevll rlvrgqaaw
10951 lghadgsaig agrqfqelgf dsltavefrn rlnaatglrl patllfdypt 
11001 padwghlrg rlgtgevsga gsvlaaldnl eaviaglsld dagehqlvag 
11051 rlevlrakwa dmrsaegavd ggadvdieea sdddmfalld delgln 

SEQ ID NO. 8



NysE



1 mttsteeslw arcfhpapaa pvrlfcfpha ggsasfyfpv
41 saqlssvaev faiqypgrqd rrkeagvsdl atladqvyda
81 lrpllkerps tffghsmgat lafevarrfe addgdlvrlf
121 asgrrapsrv reeavhrrsd dgiveelkll agtntallgd
161 eeilrmilpa irsdyqaiet yrcppdvtvr apltvltgdr
201 dpktsldeae awrghttgdf dlkvlpgghf fvsseapaii
241 dllrahlagn g







SEQ ID NO. 9



NysR1



1 mrkqsgssgl lttlvgrdde lrtlarhaaa ardgraglvl lhgpagmgkt
51 sllrsftasd vcrgmtvlyg tcgetvagag yggvrellgg lglsggdarr
101 splleglaar alpaltadpa gpdaatgayp vlhglywlaa rlmaqrplvl
151 vlddvhwcde rslawidfll rraedlpllv vlawrseaep vapavladia
201 aqrrptvlgl hplgpddige mvrrvfrtta apsfvsrvaa vsggnplala
251 rlldelraeg vrpdaagerr aaevgshvla rsvrcllerr ppwvrgvara
301 iavlgpecte llaalagvpa atvdeallvl rragilaadr vdfvhdwrs
351 avlddvappt laelrtnaal llsdagrpse elagqlmllp vldqpwmaav
401 lrdaaaqaes rgapeagvrc lyrvlevepd nvavriqmar alaeinppea
451 mrllkealsl agdvrtraqv avqygftela vqespsgvrm ledalaelta
501 elgpepgpvd relrtlvesv llivgadekv tigavrdraa rltmppgdtp
551 aqrqmlamtt vltamdgrda rsavdqarra lrapgvelep wsllsasfal
601 sladevadaq yaldlmlqyg qdnaavwtyv lalstrallh hgvgafpeal
651 adaqtaveil geerwadgav lprvalatal vdrgeperae hvldgitrpr
701 lerfvieyhw ylqarayarw vrgdfqgald lllacgrsle esrfsnpafv
751 pwwadgavll atldrhdqar elaaygsela erwgtarglg lafmaqgvaa
801 pgragidhlt eavslladsp aramearael llghahlkrd dlraarehlr
851 aaadlaqrcg avklgvdark llvtaggrvr rmtaspldml tgmertvadl
901 avtgasnrai aealfvtvrt iethltsvyr klgvggrael savletrtat
951 sgrqppawvs qargra

SEQ ID NO. 10



NysR2


1 vprskarnqp ttctpqcapd ahgdptmlle cgreqrligd llhrlgqgrp
51 svlsltgrpg haqnalvrwg acrarhdglr vlraqatpae relrygavlq
101 llavldgphg stldaairhd gppplpvpgi eevlrrtgta ptlwvedvq
151 wldpasltwl qillrhlgpd tplavlassc gdttafdtdp kapavpgppd
201 tvpvarfwp altdrgvaat vravcgtpgd eefiaaltsa tagnpailrd
251 alrafvdhgl padadhlpel haltagwgd htvraldglp aevnavlral
301 avcgdlldfh rvralagahs lsedrirtll asvgltvsvg dkvhirfpas
351 karviedmpa aeradlyvra aelthscgvn dedvahlllr ssplgapww
401 pllrrgfaaa lrredhhrac aclsralqep ldprersllt lelaaaeava
451 rpeagdrrlg elvrstvadt dptssgegvg vraidlgfar gnsewvrrta
501 gealpyagpa dreelvalfw laavrdddap mipwprlpd rpvppaqaga
551 rawqlatage dadkarklar ialtggvnes lmmpklaaca alfatddnde
601 avhgldtmlt aarsahlrsm aarifnlrar ihlcaarlea aerdldsaer
651 alpptswhpr alpnliatri lvsmetgrpd rarrlaeapv paggeegvww
701 palllararv aaddgdweea lrlsrecgrw lfrrhwanpa mlswrplaae
751 aclklgdvte arrlrdeelf fadrwgtasa rgiarlttrr lfdddgdrav
801 rrireaaall rdsparlayl wsrlsqagae tahgdtaaaa rswqavarmt
851 aahpasrlat aartltvpsv pvatapptav vppgwrdlse aekdtvllaa
901 rghgnrqiae qlavsrrtve lrlsnayrkl riggrkelyl llealegpva
951 das



SEQ ID NO. 11



NysR3



1 mllerenela riraaldaae agdsslllin gplgsgrsal lrripelagd
51 gtrvlrasaa wrerdfpfgi arqlfdhlls gaggagpaer tagaehfsrl
101 mdtgdrptgt gpalevsqav lqgaqallad asaerrllil vddlqwadgp
151 slrwlahltr rlhglrallv ctladgdhrg ryplvrevag aahtvlrlap
201 lsrdatrvll agpqgrppqd alvravyeas rgnplfltaf rsalratgrp
251 pggdhfgavr elsptvlrdr laghlriqpq pvrevavava algdhsdpvl
301 laqlagvdei gfagarralv dagllargrd vrfvhgwrd avdslltlde
351 rershddaad llyrcgrpae qvaghllaw hpgrpwseav lrsaahnalr
401 agrpadaary lrrallhhrt qdgcrarilv dlataerald pdacvrhvsq
451 avalldtsrd raaavlripp sllaapspsa velvrqaaag ldepgqrdee
501 gadelalrle awlrhsghen pvelassvar lrrmgarppv dsvaerelva
551 vllsagalsg rlsaaeiadt gnrilerepa taahahtplp lvmlslfvae
601 svqgvaswla seqhtrrrya tgaddvllta erafvlvtqg rpaaarehve
651 ralvindagdw sepavmmfaa vafelrdpal serilerird rrpaglalta
701 tgqmlqaavd vhfgrgrdal dtllacgrrl etvgwrnsal lpwrpyaigl
751 hqrlgetdaa lqlaedelrw arewgattnl gralrlkgwl lqdegldllr
801 esveilrass yatelartlv vlgrrlpggp eaeavlreaa giaaacgvpw
851 laeraelglg saivppvatl tpserrvasl vsrgltnqai atelgvssra
901 vekhltsayr klgvsgrrel vnalpgr








SEQ ID NO. 12



NysR4












1 visaqtapag esvgpglmas ldrdltikha nqefrrrfdd sagdvcgrsf
51 rdlmhpsvqq plmrqfsrli egkrhrfash wavgaqdaa fagtltasav
101 tgktpdiagi lvlmdssgaa daadagwts qkkflteida rilegiaagl
151 stiplasrly lsrqgveyhv tgllrklrvp nraalvsray smgilnvgtw
201 ppkwddfik












SEQ ID NO. 13



NysR5












1 vdaegrrrdm lelirrsgsa dwrlaeefa vsketvrrdl nvleghglir
51 rrhggaypmv rpgseavfvs rtaqpipees riataaaell seaetvfide
101 gftpqliada lprdrpltiv taslpwsaf atspqanvll lggrvrrgtt
151 atvdhwavhm lsgfvidlaf lgaegisrry glttpdpava evkaqairva
201 rrpvlagvht kfgtasfcrf gevgdletiv tgaglpvaea hryhlmgpkv
251 lrv



SEQ ID NO. 14



ORF2 



1 vaqdsgqtpr sldhvdqalv halqitpras wtrigsvlgl davtvarrwn

51 rlvetgaawi schpapvlaa sgqgclafve idcapgrlld varalaaaph 

101 walshvtgd rdlqlnvmar dpamlsrwvt hdlaaldgvr aarthlagpv 

151 htegsrwrlr algrhqvarl aadasrhrtd tpafvldeld qqlvtalsvd 

201 gratyralae qcgagpdtvr rrvqrlfaad mlharcevar plsewpvtvs 

251 fwgqvpaarl revtrrvtgm revrlcasvi srhnlhlvaw vrslddaqrf

301 evrlaeraad ltvteraval whmkhgghll deegyrvgvt plalwreptd 

351 arrg 

SEQ ID NO. 15

ORF1



1 mqpelerlrr slhrepelgl alprtqekvl aaldglplei tpgtgltsvt 

51 avlrggrpgg avllrgdmda lpvteesgvp yaseipgrmh acghdlhtag 

20 101 lvgaarllae rrerlhgdw fmfqpgeegh ngagamieeg vldaagkpld 

151 aayalhvasn qvpggvvitr sgtitsasdl ltvtvrgegg hgstpatakd 

201 pvpaacemvt avqnwvtraf difdpwvtv gtfhagtkas vipdnrrvpg 

251 hgaqlfrirp gparrrelpr lvraiaaahg leaevdylrh ypvtvndtde 

301 tafavatard Ifapgevwes piptngsedf ayvlrrvpga mllvgaapeg 

25 351 sdwqhapmnh spravfddrv lyrqaallte laarrlaata paepavag 



Translation products of SEQ ID No. 2 (Nys 2):



SEQ ID NO. 16

NysF

1 vielilpatv ateaayddrp rpgdrllsse reviaraves rqrefttvrh

51 larralrrlg hpdrailpnr rgapqwppgi vgsmthcagy raaavspael

101 saavsidaep ngplpagvln aialpserph lvalaahrpd vhwdrllfsa

151 kesvfkawyp ltqreldfse aeividptqg aftarllvpg pllggrrvtv

201 fpgrwhstpa llttavhlpa ptprrdrehr thltvnsplp rptfg
SEQ ID NO. 17

NysG

1 maspddleee rtaprparrl vgllrphrrs valavsmgvg givlnafgpl
51 llgrvtdlia dgvlggvpgp apgidfaaig rlllvllaly waslfmlaq
101 grlvasavwr tihelrrdar ekltrlplrh fdrqpagell srttndidnl
151 qqtlqqtlae litsifsilt mlvlmlvisp slawmllsv pvsaliaari
201 skraqphyaa qwsangtlna hveevctgha likgfdrraa aeerfdacnd
251 avyraaakaq fasgamepvm mfvanlgyvl vavigawkvi ngtltlgdvq
301 afilyarqfs qpiveiasva grlqsgiasa qrvftlldap eqapdplrpg
351 tparaegrve ftdvsfrysp dtplienlsl tvepgstvai vgptgagktt
401 lgnllmrfye pdsgrilldg tdtatmtrdd lrsrfglvlq dtwlfggtia
451 eniaygapga cradieeaar atcadrfirt lpqgydtvld desgtvsage
501 kqlltvaraf larpavlvld eatssvdtrt evliqramns lragrtsfvi
551 ahrlstirda dliwmdagr iveqgthdql lcaqglyarl haarthtpta
601 gaaag

SEQ ID NO. 18

NysH

1 vllrllraql rpyawataal valqlvqilg tlllptlgaa lidqgwrgd
51 ggritelgw mgwalvqia aalgaaalaa rtatamgrdl rsalfrrild
101 fsareigrfg tpslltrsvn dvqqvqnlaq tgfgiwcap lmclgsvlla
151 lrqdvplall lvalvlwav cfglllarmg tlyarmqltl drlgrllrea
201 itgvrwrsf vrddherarf aqtndaflvv srrvgrliat mlpwlllmn
251 gftvallwtg shridagrmp igslsallsy lslilmswm lafvflsvpr
301 arvcagriae vldtgssvap paapqpvrgp agrielcaag yrypgaeepv
351 lrdvdltvep geriavlgst gsgkttllnl vlrladateg avrvggtdvr
401 eltaatlaaa vgfvpqrpyl fsgtvasnlr fgrpdatdee lwealrvaqa
451 adfvarmpdg ldaeitqggg nvsggqrqrl slarallrrp eiylfddcfs
501 aldqatdaal rtalvpytag atvitvaqri sagrdadriv vldrgrwaq
551 gthdvllrts ptyreialsq lteeeaahgl agrp
NysD3

1 mskralitgi tgqdgsylae hllsqgyqvw glirgqanpr ksrvsrlase

51 ldfidgdlmd qgslvsavdt vqpdevynlg aisfvpmswq qaelvtevng

101 mgvlrmleai rmvsglstsr tvsprgqirf yqasssemfg kaaetpqret

151 tlfhprspyg aakayghyit rnyresfgmy avsgmlfnhe sprrgqefvt

201 rkislavari kqglqdklal gnldavrdwg yagdyvramh lmlqqdagdd

251 yvigtgqmhs vrdavriafe hvglnwedyv vidpdlvrpa evevlcadsa

301 kaqdrlgwkp dvdfptlmrm mvdsdlaqvs renqygdvll aanw

SEQ ID NO. 20

NysI aa sequence (partial)

1 mdneqklrdy lklatadlrr trrrvhkles aaqepvaiig mtcrypggvr

51 spedlwrmve agehgvtpfp tdrgwdleal aaaptasggf lhdapdfdad

101 ffgispreav amdpqqrvvl esaweafera gidptsvkgs rtgvfigama

151 qdyrvgpadg aegfqltgnt gsvlsgrisy tfgtvgpavt vdtacssslv

201 avhlatqalr agectlalag gvtimsgpgt fiemgrqggl sadgrcrsfg

251 dtadgtgwae gvgilvlerl sdavrnghei lavvrgtavn qdgasnglta

301 pngpsqqqvi qqalvnarla agdidvveah gtgttlgdpv eaqallatyg

351 qnrpadrpll lgsvksnlsh tqaaagvagv ikmvmamrhg tlprtlhaee

401 pthhvdwsqg avrlltdttd wpatgaprra avssfgisgt nahtiieqap

451 epqpedaata qddaagstpa tapvvpgvvp vllsgrtpda lrgqaaalra

501 aldtgrrpdl ldlahslatt ragfehravl latdhpaltd gltaladadd

551 paaapawitg ttraetrlav lftgqgaqrl gagrelaarf pafataldaa

601 ldaftphldr plrevlwgtd aalldrtaya qpalfaveva lyrliesfgv

651 rpdhlaghsv geivaahlag vlsladaatl vaargrlmqa lpdggamiav

701 qaseadvapl laghedqvai aavngpsavv lsgaeatvta laeqlaadgr

751 ktrrlrvsha fhsplmepml dafravvedl tlqppllpvv snltgkpatv

801 aqltsadywv dhvrhavrfa dgidwlarhd ttaflelgpd gvlsamaqdc

851 ldaadadavt lpalragrpe ehtlttalag lhvhgatldw tgcfagtgar

901 rtdlptyafq rrrywpkalq sgtadlrsvg lgaahhplls aavsladagg

951 tlltgrlsrq thpwladhtv rgttllpgta flelavragd evgcdrveel
1001 tlaaplllpe qggvqvqlwi gnpdvsgrrt vnvharpdtg ddtpwtahat
1051 gvlttadasr qlpasseqgg tplagdphpa ldaaqwppag aeplpldghy
1101 drladggfgy gpvfqglraa wrggdwyae velpeagrsd aeafglhpal
1151 ldaalhaapf tglgergrgg lpfswegvsl haggattlrv rltpvaddal
1201 altvadgtga pvlsvdslvl rsvatqqldt aaavardalf rldwtpvqpt
1251 atdpgpvall gadpfgllth agfadapayp dlaalaaadg pvpttwlsl
1301 agtgddaadp arsahrcaae alaavqtwld hherfaaarl vfvtrgatvg
1351 rdvaaaavwg lvrsaqsenp gcfalvdldp dgavgaaalv aalvsgepql
1401 avrgdvlrva rlvrrpltev gagadgtgdg vgdgsgvsfs gegavlvtgg
1451 tgglgavlar hlvaeygvrd lllvsrsger avgagelvae lagvgarvrv
1501 vacdvtdraa welvgghav sawhaagvl ddgmvgaltg erlsavlrpk
1551 vdavwhlhea trgldldafv vfsslagvfg spgqanyaaa nafldalmtr
1601 rraeglpgls lawgpweqsg gmtgtltdvd aerlarsgvp plsvaqglal
1651 fdaavagtda tcvpvrldlp vlrargevpp llrslirvra rraavagsat
1701 agnlaqrlrr ldedgrdemv ldlvrgqval vlghatggdv dagrafrdlg
1751 fdsltavelr nrlntvtglr lpatlvfdyp tvrhlatyvl dellgtdaev
1801 atvqpaavav addpivivgm acrypggvss pedlwrvlte gtdavsgfpt
1851 nrgwdvesly hpdpdhpgts ytrsggflhe agefdpgffg msprealatd
1901 sqqrllless weaieragid pvslrgsrtg vfagvmysdy samlaspefe
1951 gfqgsgssps lasgrvaytl glegpavtvd tacssslvam hwamqalrsg
2001 ecglalaggv tvmstpavfv dfarqrglsp dgrckafada adgvgwsegv
2051 gvlvlerqsd avrngheila wrgsavnqd gasngltapn gpsqqrvirq
2101 alasggltag dvdweahgt gttlgdpiea qallatygrd reperplllg
2151 svksnlghtq aaagvagvik mvlamrhgw prtlhvdaps shvdwsegav
2201 ellseqaawp etgrvrragv ssfgisgtnv hviveqapga kaiaaagaar
2251 rtpgavpvll sgrgrsalrg qaarllghlq arpdaelvdv alslattrsr
2301 feqraawaq drdqliaslg alaadrpdpa wegeaagrg rtavlftgqg
2351 sqraamgrel hevqpefaaa fdavcavfdp lldrplrew faedgsdeaa
2401 lldetgwtqp alfavevalf rlveswgvrp dfvaghsige iaaahvagvl
2451 tledacrlva aratlmqalp tggamiaiqa tedeiaahld dtvaiaavng
2501 pqswisgde eaaetiaatf aergrktkrl rvshafhspr mdgmldafri
2551 vaegltyrap riplvsdltg rraddaevct aeywvrhvre avrfadcvrt
2601 lrdagattfl elgsdgllta maedtlgddh daelvpmlra graeelaaat
2651 alarlqvrgv dvdwaaylag tgarrtdlpt yafqhayywp qlptpaaala
2701 aadpadqqlw aavergdare ladilglgeq dltpldsllp altswrrgnq
2751 ekhlldtlry rvewtrlskp tapvldgtwl lvasdataad qpalldglad
2801 algshgarvr rlllddscad ravlaerlar tadvdaatqv lsvlplderd
2851 addcppltrg laltvalvqa ladtgaqgrl wtatrgavst npadpvthpv
2901 qaaawglgrg valehprlwg glvdlpqvfd eragqrlagi lavkdapdge
2951 dqvalratgv sgrrlvrhtv ealptaaeft atgtvlitgg tgglgaevar
3001 wlaragaqhl vltsrrgpda pgaaelrael egygpsvsw acdvadrdal
3051 aavltalpee lpltgwhta gvghygpldt lstaefaglt aaklagaahl
3101 dalladreld ffvlfgsiag vwgsgnqsay gaanayldal alhrrargla
3151 atsvawgpwa eagmaaddav setlrrqglg lldpapamte lrrawrqdv
3201 tvtvadvdwq ryaplftsar psaliaglpe vralaadert eqdatgasev
3251 vtrvralaep eqlrlltdlv rtesatvlgh ssadavpegr afrdvgfdsl
3301 tavelrkrlg aatglslpst mvfdyptple laqylraeil gavlevagpv
3351 atggaddepi aiigmacrfp ggvsspeqlw dlvasgtdai sefpvnrgwq
3401 tghlfdpdpd rpgttystqg gflheadefd ptffgispre alvmdpqqrl
3451 llettwesfe ragirpetlr stltgtfvgs syqeyglgag dgteghmvtg
3501 sspsvlsgrl syvfglegpa vtvdtacsss lvalhlacqs lrngesnlav
3551 aggatimttp npfiafsrqr alakdgrcka fsddadgmtl aegvgwlve
3601 rlsdaqrngh pvlavlrgsa inqdgasngl tapngpsqqr virqalanar
3651 lapgdidale ahgtgtplgd pieaqalfat ygrdrdpesa lllgsvksni
3701 ghtqsaagia svikmvmalr hselpptlha dapsshvdws agtvrlltqa
3751 rawpetgrpr raavssfgis gtnahvlleq apvadtpaee rpavapvpia
3801 agwpwwta rsaaalrgqa erllahaetv gtalpaagpl diglslvsar
3851 arfehrawv ppagtdplaa lravatdgps pwargvadv egrtvfvfpg
3901 qgsqwvgmgs qlldesavfa eriaecaaal aeftdwslvd vlrgwgaps
3951 lervdwqpa sfavmvslaa lwrsrgvlpd awghsqgei aaawsgals
4001 lrdgarwal rsqaigrala grggmmsval svdvleprlv efegrvsvaa
4051 vngprsvwa gepealdalh arltaddira rriavdyash shqvedlhee
4101 llevlaelap rtsevpffst vtgdwldtar mdagywfrnl rgrvrfadav
4151 adllaaeyra fvevsshpvl smavqeaide agvpavaagt lrrdqggtdr
4201 fllsaaevfv rgvdvdwagl fegtgasrid lptyafqheh lwavppapea
4251 vaaadpddaa fwtavedgdv saltaalgtd edsvaavlpa ltswrrarrd
4301 rstvdawryr vawkplggtl phpsltgtwl lvtadgiddt dvagaletyg
4351 aevrrlvlde ecvdravlre rlagaedvtg ivsvlaaaer tdavpgtslv
4401 lgtaltvali qalgdaeida pvwaltrgav stgradelta pvqaqvtgig
4451 wtaalehpqr wggtldlpaa ldaraaqrla avlsgalgsd dqlairpsgv
4501 ftrrivraea tagrpagtwt prgttlvtgg sgtlaphlar wlaqrgaehl
4551 vlisrrgtaa pgaaelvael aesgteatva acditdrdav aalladlkad
4601 grtvrtwht aatielhtld attladfdrv lhakvtgaqv laellddeel
4651 ddfvlyssta gmwgsgahaa yvagnaylaa laehrrangl palslswgiw
4701 addlklgrvd pqmirrsgle fmdpqlalsg lqralddnen vlavadvdwe
4751 tyhpvytsgr ptplfdevpe vrrltaaaeq sagtvaegef aaalralsda
4801 eqqrtlletv rteaasvlgl ssaedltdqr afrdvgfdsl tavglrnrla
4851 svtgltlpst mvfdypnpaa laaylhgela garsaaagaa avptgapdad
4901 dpiaivgmsc rypggvgsae dlwrialdev daisgfpadr gwdaeglydp
4951 dpdrpgrtys vqggflrdva efdpgffgis prealsmdpq qrllletawe
5001 afehagidpv gqrgsrtgtf vgasyqdyas gvpnsegseg hmitgtlssv
5051 lsgrvsylfg fegpavtldt acssslvamh lacqslrnge sslalaggvs
5101 imstpmsfvg fsrqralaed grckayadga dgmtlaegvg lvllerlsda
5151 ranghqvlav irgsavnqdg asngltapng psqqrvirqa lansavapgd
5201 idvleghgtg talgdpieaq allatygqdr aperplllgs vksnightqm
5251 asgvasvikl vralqegwp kslhidrpst hvdwssgaig lltertpwpe
5301 tgrprraavs sfgisgtnvh tileqapade aptpadpprd glvpvllsgr
5351 geaalraqaa rllafveerp eahltdlahs latsraaler raaviaadrd
5401 tltrglrals dgrpdpglvq gtagrgrtaf lftgqgsqrp gmgrelhdry
5451 pvfadaldev larlddgpdr plrevlfaap dsaeaalldr tgyaqpalfa
5501 vevalfrllt swgltpdyla ghsvgelaaa hvagvlsldd actlvaargr
5551 lmqalpegga mvaleaaede vlpllegltd rvsvaavngp rsvwagvee
5601 dvllladlfa adgrrtkrlr vshafhsplm damlddfaav argltyhppt
5651 ipfvsnvsgg lataeqvrtp dywvghvraa vrfadgidwl atqgdvhtfl
5701 elgpdgvlsa maresltdps rtallptlrg drpeepalvt avaaahahga
5751 rvdwsgyfad hgarrttlpt yafqrerywp dttaatsaht pgsaldaefw
5801 aaverddvaa laasldldda tvtamvpalt awrrrrgeqt eldswryrvt
5851 wkprggatap aaltgrwlvl vphdhqdrqd dataawaadv etalgtttvr
5901 ltvtttdraa laariteaag dqgpfsgvls llplatgdag hpgapaaltl
5951 tttavqalgd agidaplwnv trgavavgra eqvtapeqaa vwglgraval
6001 elparfggtl dlpatldgqa arrlravlaa tdgedavalr psgvflrrla
6051 hapagpdtar tafdpaagtv litggtggig ghvarrlard gathllltsr
6101 rgpaapgada lraeleelga rvtlaacdaa drdalaalla elpddaplca
6151 vfhtagwed hwdaltpen faavlraktv aahhlhelta dldlaafvlf
6201 sstagvlgaa gqgnyaaana hldalaehrr shgltalsva wgpwagsgmv
6251 adaaeltdrv rrggfeplap epavrallra ienddllval adidwerfqr
6301 afaavrplpf vadlpetgra tpatatgaat glrqqlaelp eherpaavld
6351 llrtqvaavl ghadprtved dhafrdlgfd sltilelrna lnaatglslp
6401 atlvydlptp remadfllae llgtlptdta atvastaspk lsasfeqggt
6451 pfddpiavig igcrfpggvt tpeelwqlld egrdgisrfp ddrgwdlaal
6501 gagasdtleg gfltgvadfd arffgispre alamdpqqrl llettweale
6551 ragidpttlr gsttgvfvgt ngqdyptllr rsasdvagyv atgntasvms

6601 grlsyalgle gpavtidtac ssslvalhwa gralragecd lwaggvsvm

6651 aspdsfvefs tqgglapdgr ckafsdaadg tawsegvgil vlerlsaarr

6701 nghqvlglir gtavnqdgas ngltapngls qqrviaqala darlrpadid
6751 aieahgtgtt lgdpiearal itaygrdrda erplllgtvk snightqaaa

6801 gaagvikmlm amrhgtlprt lhvgtpsshv dwsggtvall ddarpwprtg

6851 qprragvsaf gvsgtnahw veqapeteap aapaaepape atptwpww
6901 sgrsrealqa qldrltahta ahparsaadv grslatdrtl fphravllag
6951 pdgvreaara aaprtpgrta flfsgqgaqh almghdlyqr fpvyadaldt

7001 vlaqfdtvld vplraalfaa pgtpeaalld qtgftqpalf avevalfrla

7051 eswrltpdfv aghsigei

SEQ ID NO: 21

Glu(Asp) Leu Gly Phe(Leu, Val) Asp Ser Leu

SEQ ID NO. 22

GAG/C CTG/C GGC/G T/CTG/C GAC TCC/G CTG/C
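
The degenerate oligonucleotide of SEQ ID NO. 22 can be expanded mechanically into its concrete sequences. The following is a minimal sketch only, assuming that "X/Y" denotes two alternative bases at a single position and that the groups are read as codons under the standard genetic code; the function names (`parse_positions`, `expand`, `encoded_residues`) are hypothetical and are not part of the listing.

```python
from itertools import product

# SEQ ID NO. 22 as printed; "X/Y" is assumed to mean base X or base Y
# at one position of the oligonucleotide.
SEQ_ID_22 = "GAG/C CTG/C GGC/G T/CTG/C GAC TCC/G CTG/C"

# Standard genetic code, restricted to the codons this oligonucleotide can yield.
CODON_TABLE = {
    "GAG": "Glu", "GAC": "Asp",
    "CTG": "Leu", "CTC": "Leu", "TTG": "Leu",
    "TTC": "Phe",
    "GGC": "Gly", "GGG": "Gly",
    "TCC": "Ser", "TCG": "Ser",
}


def parse_positions(oligo):
    """Return one set of allowed bases per nucleotide position."""
    positions = []
    take_alternative = False
    for ch in oligo.replace(" ", ""):
        if ch == "/":
            # The next base is an alternative for the previous position.
            take_alternative = True
        elif take_alternative:
            positions[-1].add(ch)
            take_alternative = False
        else:
            positions.append({ch})
    return positions


def expand(oligo):
    """List every concrete sequence covered by the degenerate oligonucleotide."""
    return ["".join(bases) for bases in product(*map(sorted, parse_positions(oligo)))]


def encoded_residues(oligo):
    """For each codon, list the amino acids its variants encode."""
    positions = parse_positions(oligo)
    residues = []
    for i in range(0, len(positions), 3):
        codons = ["".join(c) for c in product(*map(sorted, positions[i:i + 3]))]
        residues.append(sorted({CODON_TABLE[c] for c in codons}))
    return residues


if __name__ == "__main__":
    print(len(expand(SEQ_ID_22)), "sequences")
    print(encoded_residues(SEQ_ID_22))
```

Under these assumptions the fourth codon, as printed, expands to Phe and Leu codons only.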

SEQ ID NO. 23

Val Asp Thr Ala Cys Ser Ser

SEQ ID NO. 24

G/CGA G/CGA G/ACA G/CGC C/GGT GTC G/CAC

SEQ ID NO. 25

GTTGGTACCCCACTCCCGGTCCGCAC

SEQ ID NO. 26

CCAGCCGCATGCACCACC

SEQ ID NO. 27

CCG CGT CGG ATC CGC CGA C

SEQ ID NO. 28

AGC CTT CGA ATT CGG CGC C
SEQ ID NO: 29

DNA sequence of ERD48.seq

1 gatccgccga cgacaccggc cgccgcaccg tcaccgtcca cgcccgcccc
51 gacgacaccg ccgaccgcac ctggacgctg cacgccaccg gtgtgctcgc
101 caccacgcca ccggccgccg cggcgttcga caccacggtc tggccgcccg
151 ccgacgccga acccctcacc accgacgact gctacgcaca cttcaccacc
201 caccgcttcg cctacggccc cgccttccag ggcctgcggg ccgcctggcg
251 cgccggcgac gtgctgtacg ccgaggtcgc cctgccggag tccgccaccg
301 acgaagcggc cgccttcggc ctgcacccgg cgctcctgga cgccggcctg
351 cacgccgcgc tcctcgccga cgaccgcgac accggactcc cgttctcctg
401 ggaaggcgtc actctgcacg cctccggcgc caccgcgcta cgcgtccggc
451 tcgccccgaa cggccccaac ggcctgtccg tcaccgccgc cgacccggcc
501 ggcaaccccg tcgccaccgt cacccgcctg ctcgcccgcc ccctggacgc
551 cgagcagttg accatccaca gcgccctgac ccgcgacgcg ctcttccacc
601 tggactggac cccggtcccg cttcccgaca ccgccaactc cgcgccgccg
651 gccctcctcg gcccggacac cgccgtgctc gccgacgccc tcggcgaccc
701 ggccgtcgca cgccacgcaa ccctcgacga cctcctggcc ggggacacca
751 ccccgcccgc cacggtcctc gtccccctcg gcgccccact cgacggcgac
801 accgcgcagc acgcgcacgc cctcacccgc agcgcgctga ccctcgtcca
851 gcagtggctc gccaccgacc gcctcgccga ctcccgcctg gtcttcgtca
901 cccacggagc cgtcgccacc gacgacgcgc cccccaccga cctggccgcc
951 gccgcggtct ggggcctgat ccgctccgcg cagaccgaga accccggcac
1001 cttcaccctc ctcgacctcg acaccgagcc cgactcgacc accgcgctca
1051 gccgcgccct gaccctcgac gaaccacagc tcctcctccg cgccggccgc
1101 gcccgcgccg cccgcctcac ccgcaccccc gcccccacca ccaccaccca
1151 cacgccgtgg tccgcggacg gaacggtgtt ggtgacgggt ggtacgggtg
1201 gtctgggtgg gttggtggcc cggcatctgg tgcggtcgtg tggggtgcgg
1251 catttgttgt tgaccagtcg ttctggtgtg ggtgctgcgg gtgcggccgg
1301 gttggtcgcg gagttggagt cgttgggcgc gcgggttgtg gttgcggcgt
1351 gtgatgtggg tgatggctcg gctgttgcgg agttggttgc cggtgtgtcg
1401 gagtcgtatc cgttgtctgc ggtggtgcat gcggctggtg tgttggatga
1451 cggtgtggtg ggttcgttga cgccggagcg gttggctgcg gtgttgcgtc
1501 cgaaggtgga tggtgcgtgg aacctgcatg aggcgacgcg tggtctggat
1551 ctggacgcgt ttgttgtctt ctcgtctgtt gcgggtgtgt tcgggggtgc
1601 gggtcaggcc aactatgcgg cgggtaatgc gtttttggac gcgttgatgg
1651 ttcatcgggt ggctggtggg ttgcctggtg tgtcgttggc gtggggtgct
1701 tgggatcagg gtgtggggat gacggcgggg ctgacggagc gggatgtccg
1751 tcgtgctgct gagtcgggta tgccgttgtt gacggttgat cagggtgtgg
1801 cgttgttcga tgcggcgttg gcgacgggga gtgccgcgtt ggtgccggtc
1851 cgtctggacc tggccgcact gcgcacccgg ggcgacatcg caccgctcct
1901 ccgcggcctg gtcaaggcgc ccatccgccg cgcagccgcc accacacccg
1951 gcgacaccgg actcgccgag cagctcaccc ggctccagcg cgccgagcga
2001 cgggacaccc tcctcgcgct cgtccgcgac caggccgcga tggtcctcgg
2051 ccacacctcg ggcgacggcg tcgacccgtc ccgcgccttc cgcgacctcg
2101 gcttcgactc gctcaccgcg gtcgaactcc gcaaccgcat cggcgcggcc
2151 accggcctgc ggctaccggc cacggccgtc ttcgactacc ccaccgccga
2201 tgccctcgcc gcacacctgc tcaccgaact gctcggcccc gacgccgagt
2251 cggaccccga cgagcccggc gaccccaccg cgggaccgac cgacgacccc
2301 atcgtcatca tcggcatgag ctgccgcttc cccggcgaca tcggctcgcc
2351 ggaggacctg tggcgcctgc tcggcgacgg cgccgacgtc gtcaccgact
2401 tcccgaccaa ccgcggctgg gacctggaca acctctacga ccccgacccc
2451 gcgcacgccg gcacctcgta cgcccgcacc ggcggtttcc tgcacgacgc
2501 cgccgacttc gacgccgact tcttcggcat gagcccccgc gaggccatgg
2551 ccacggactc ccagcagcgc ctgctgttgg agtcctcgtg ggaggcgatc
2601 gagcgggccg gcatcgaccc gctgaccctg cgcgacagcc gcaccggcgt
2651 cttcgccggc gtcatgtaca gcggctacgg cacccgcctc gacggcgccg

SEQ ID NO: 30



Translation product of SEQ ID NO. 29 (ERD48.seq)

1 saddtgrrtv tvharpddta drtwtlhatg vlattppaaa afdttvwppa
51 daeplttddc yahftthrfa ygpafqglra awragdvlya evalpesatd
101 eaaafglhpa lldaglhaal laddrdtglp fswegvtlha sgatalrvrl
151 apngpnglsv taadpagnpv atvtrllarp ldaeqltihs altrdalfhl
201 dwtpvplpdt ansappallg pdtavladal gdpavarhat lddllagdtt
251 ppatvlvplg apldgdtaqh ahaltrsalt lvqqwlatdr ladsrlvfvt
301 hgavatddap ptdlaaaavw glirsaqten pgtftlldld tepdsttals
351 raltldepql llragraraa rltrtpaptt tthtpwsadg tvlvtggtgg
401 lgglvarhlv rscgvrhlll tsrsgvgaag aaglvaeles lgarvvvaac
451 dvgdgsavae lvagvsesyp lsavvhaagv lddgvvgslt perlaavlrp
501 kvdgawnlhe atrgldldaf vvfssvagvf ggagqanyaa gnafldalmv
551 hrvagglpgv slawgawdqg vgmtaglter dvrraaesgm plltvdqgva
601 lfdaalatgs aalvpvrldl aalrtrgdia pllrglvkap irraaattpg
651 dtglaeqltr lqraerrdtl lalvrdqaam vlghtsgdgv dpsrafrdlg
701 fdsltavelr nrigaatglr lpatavfdyp tadalaahll tellgpdaes
751 dpdepgdpta gptddpivii gmscrfpgdi gspedlwrll gdgadvvtdf
801 ptnrgwdldn lydpdpahag tsyartggfl hdaadfdadf fgmspreama
851 tdsqqrllle ssweaierag idpltlrdsr tgvfagvmys gygtrldga


SEQ ID NO. 31

CGCCGCATGCTGTTCTCACCCCACGT

SEQ ID NO. 32

GGCGCGACCCGGTTCGGCCT

SEQ ID NO. 33

GCGAGCGGCCGCTTCACCCCGCAACTCA

SEQ ID NO. 34

CGCGAAGCTTGGCCGACTGCTCGACGTC

SEQ ID NO. 35

DNA sequence - 125401 bps (entire cluster):

1 gaattcggat cgatgctgct cgtggtgtgt gccgcgggag gtgccgacgg
51 tgccgtgcgt gaatcggggc gtggtcgccg tcgcgatcgc ggtggtgtcg
101 gtgctgctcg tgctgaccgt ggtggtggcg acggcctact tgccgtgagc
151 ccggtccttg cgccgcgacc tgggcagcgg atgcacctct tcggtcgacg
201 ccaaggggcg gtggcgcagg tcgagatagg cgcggacgga gacgccgggc
251 tgccccggcg tctgccaggg cgcggagggg ccgctggaac gccacagacg
301 gcccagctcg gccgcggcaa gggtggcggt ggcttcgtcg gcggcggtga
351 cgtcgatgac gaccatgccg ggttcgctga catgttcctc gtggatatgc
401 acagttctgt atgacctatt tcgccccgtg gcgtaaggag tagccggcag
451 gtttcatccg aaggtggggc gcgggagcgg cgagttgacc gtgaggtgcg
501 tgcggtgttc ccggtcgcgc cggggggtgg gggccggcag gtggaccgcc
551 gtggtcagca gggcgggggt gctgtgccag cggccgggga agaccgtgac
601 ccggcgaccg cccagcaggg ggcccgggac caggagtcgg gcggtgaagg
651 cgccctgtgt cgggtcgatg acgatctcgg cctcggagaa gtccagttcg
701 cgctgggtca ggggatacca cgccttgaag acgctctcct tggcgctgaa
751 gagcagccgg tcccagtgga cgtccggacg atgggccgcc agggccacga
801 ggtgcggccg ttccgagggc agggcgatgg cgttcaggac gccggccggc
851 agcggaccgt tcggttcggc gtcgatgctc accgcggccg acagctccgc
901 gggggagacg gcggcggcgc ggtagccggc gcagtgcgtc atgctgccga
951 cgatgcccgg cggccactgc ggggcgccgc gccgattggg cagtatggcc
1001 cggtccgggt ggccgagccg gcgcagggcc cggcgggcga gatggcggac
1051 ggtggtgaac tcgcgctgcc gggactccac ggcccgggcg atgacctcgc
1101 gttccgagga gagcagccgg tcgccggggc gcggccggtc gtcgtacgcc
1151 gcctcggtgg cgaccgtggc gggcaggatc agttcgatca ccgcatacct
1201 ccggcgaacg gtaagaacag ggggttgctt cggggcacgg ttccgtcctt
1251 gacggggcgc cgtgggcggc cgggtcagcc ggccgcggcg cccgcggtgg
1301 gagtgtgggt gcgggctgcg tgcagccggg cgtacaggcc ctgcgcgcac
1351 aggagttgat cgtgggtgcc ctgctcgacg atgcggccgg cgtccatcac
1401 gacgatcagg tccgcgtcgc ggatcgtgga caggcggtgc gcgatcacga
1451 agctggtccg gcccgcccgc agggagttca tggcgcgttg gatcaggacc
1501 tcggtccggg tgtccacgga gctggtggcc tcgtccagca cgaggacggc
1551 cggtctggcg aggaaggccc gggccacggt cagcagctgc ttctcgccgg
1601 cgctgacggt gcccgactcg tcgtccagca cggtgtcgta gccctgcggc
1651 agggtgcgga tgaagcggtc ggcacaggtc gcgcgggccg cctcctcgat
1701 gtccgcacgg caggcaccgg gtgcgccgta cgcgatgttc tccgcgatgg
1751 tgccgccgaa cagccaggtg tcctggagca ccagcccgaa gcgggaccgc
1801 aggtcgtcgc gggtcatcgt cgcggtgtcg gtgccgtcca ggaggatgcg
1851 gccggagtcc ggttcgtaga agcgcatcag gaggttgccg agggtggttt
1901 tgccggcgcc ggtggggccg acgatcgcca cggtgctgcc cggttccacg
1951 gtcagcgaga ggttctcgat gaggggcgtg tcgggggagt agcggaagga
2001 cacgtcggtg aactcgacgc ggccctcggc gcgggcgggc gtgccgggcc
2051 ggagcgggtc cggggcctgc tcgggggcgt cgagcagggt gaagacgcgc
2101 tgggcggagg cgatgccgga ctggagccgg cccgccaccg aggcgatctc
2151 cacgatcggc tggctgaact ggcgggcgta gaggatgaac gcctgcacgt
2201 caccgagggt cagggtgccg tttatgacct tccaggcgcc gatgacggcc
2251 accagcacat agccgaggtt ggcgacgaac atcatgaccg gttccatggc
2301 accggaggcg aactgcgcct tggccgcagc ccggtagacc gcgtcgttgc
2351 aggcgtcgaa gcgctcctcg gccgccgcgc gccggtcgaa gcccttgatc
2401 agcgcatgac cggtgcacac ctcctccaca tgggcgttga gggtgccgtt
2451 cgcggaccac tgcgcggcgt agtggggctg cgcgcgcttg ctgatccggg
2501 ccgcgatcaa cgccgagacc ggcacgctga gcagcatcac cacggccagc
2551 gacggcgaga tcaccagcat cagcaccagc atcgtcaaca gcgagaagat
2601 cgaggtgatc agctcggcga gggtctgctg gagggtctgt tggaggttgt
2651 cgatgtcgtt ggtggtgcgg ctgagcagct caccggccgg ctgccggtcg
2701 aagtggcgca gcggcagccg ggtcagcttc tcccgggcgt cgcggcgcag
2751 ttcgtggatg gtgcgccaca ccgcggacgc caccagccgg ccctgcgcca
2801 gcatgaacag cgacgccacg acgtagagcg ccagcaggac cagcagcagc
2851 cggccgatcg cggcgaagtc gatcccgggc gccgggcccg ggacgccgcc
2901 gagcacgccg tcggcgatca gatcggtgac ccggccgagc agcagcgggc
2951 cgaacgcgtt gagcacgatc ccgccgacgc ccatcgacac ggcgagtgcc
3001 acggagcggc ggtgcggacg cagcaggccg acgagccgac gtgccggccg
3051 gggcgcggtc cgctcctcct caaggtcgtc cggcgaggcc atgggcggct
3101 tcctcctcgg tcagctgcga gagcgcgatc tcgcggtagg tcgggctggt
3151 gcgcagcagc acgtcgtggg tgccctgggc gaccacgcgc ccgcggtcca
3201 gcaccacgat ccggtccgcg tcgcggccgg cggagatccg ctgcgccacg
3251 gtgatcaccg tggcgcccgc ggtgtacggc accagcgcgg tccgcagcgc
3301 cgcatcggtg gcctggtcga gcgccgagaa acagtcgtcg aagagataga
3351 tctccggccg gcgcagcaac gcccgggcga gggacaggcg ttggcgctgg
3401 ccgccggaga cattgccgcc gccctgggtg atctccgcgt cgaggccgtc
3451 cggcatccgc gccacgaagt cggccgcctg ggcgacccgc agcgcctccc
3501 acagctcctc gtcggtggcg tccggccgcc cgaagcgcag attgctcgcc
3551 acggtgccgg agaacaggta cggccgctgc ggcacgaacc cgacggcggc
3601 ggcgagcgtg gccgcggtca gctcgcggac gtcggtgccg ccgacccgca
3651 ccgcgccctc ggtggcgtcg gccagccgca gcaccaagtt caacagggtc
3701 gtcttgccgc tgccggtgct gccgagcacg gcgatccgct cgcccggctc
3751 gacggtcagg tcgacgtccc gcagcacggg ctcctcggcg cccgggtagc
3801 ggtacccggc cgcgcacagt tcgatccggc cggcgggccc gcgcaccggc
3851 tgcggcgcgg ccggcggcgc cacgctcgac ccggtgtcca ggacctccgc
3901 gatccggccg gcacagaccc gggcccgcgg caccgacagg aacacgaagg
3951 cgagcatcac gacggacatc aggatcagcg agagatagct caggagggcg
4001 ctgagcgagc cgatcggcat ccggcccgcg tcgatccggt gggagccggt
4051 ccacagcagg gctacggtga aaccgttcat cagcagcagc acgaccggca
4101 gcatcgtcgc gatcagccga cccacccgcc gcgacaccac gaggaacgcg
4151 tcgttggtct gcgcgaaccg cgcgcgctcg tggtcgtcgc ggacgaagga
4201 ccggaccacc cgcaccccgg tgatcgcctc gcgcagcagc cgccccagcc
4251 ggtccagggt cagctgcatc cgcgcgtaca gggtgcccat ccgggccagc
4301 agcaggccga agcagaccgc caccaccagc accagcgcca ccagcagcag
4351 tgccagcgga acgtcctggc gcagcgccag cagcacgctg cccaggcaca
4401 tcagcggcgc gcagacgacg atgccgaagc cggtctgggc gaggttctgc
4451 acctgctgca cgtcgttcac cgaccgggtc agcagggagg gggtgccgaa
4501 ccggccgatc tcgcgggcgg agaagtccag gatgcggcgg aagagcgcgg
4551 accgcagatc gcggcccatc gccgtcgcgg tccgggcggc cagcgcggcc
4601 gcaccgagcg cggcggcgat ctgcaccagc gccaccacgc ccatcaccac
4651 acccagctcg gtgatgcgcc cgccgtcgcc gcgcaccacc ccctggtcga
4701 taagtgcggc gcccagcgtc ggcagcagca aagtgcccag gatctggacg
4751 agttgaaggg cgacgagagc ggcggtggcc caggcgtagg ggcgcagctg
4801 tgcccgcaga agtctcaaca gcacggagga acaccccggt tgacggcggc
4851 gtgcggccgc gcggtggacg gagtcggccg gcccgccgac ccggtcattg
4901 caccgcagtc cgctccgaaa tttcactagt gttgggggtg gcaacggact
4951 tttgacggcc cggtactcgt gaattcccta gaaagcccgg gatgcgttga
5001 cagcatcttc cgcggcttgc gagcgtgcgg gtgtctatcg tccggactgc
5051 gtattcgacg ccaggtcgtg gccgagctca agcgtttgga cggtctcttt
5101 cgtgttcgag aagggtttgc catgtccaaa cgagcgctga tcaccggaat
5151 caccggccag gacggctcct atctcgcgga gcacctgctg tcccagggct
5201 accaggtgtg gggtctgatc cgcggccagg ccaatccccg caagtcccgg
5251 gtcagccgcc tcgcctccga actcgacttc atcgacgggg acctgatgga
5301 ccagggcagc ctggtctccg ccgtcgacac cgtgcagccc gacgaggtct
5351 acaacctcgg cgccatctcg ttcgtgccga tgtcctggca gcaggccgag
5401 ctggtcaccg aggtcaacgg catgggcgtg ctgcgcatgc tggaagccat
5451 ccgcatggtc agcggactgt ccacctcccg cacggtcagc ccgcgcggcc
5501 agatccgctt ctaccaggcg tccagctcgg agatgttcgg caaggccgcc
5551 gagacgccgc agcgcgagac caccctcttc cacccgcgca gcccctacgg
5601 cgcggcaaag gcgtacgggc actacatcac ccgcaactac cgcgagtcct
5651 tcggcatgta cgcggtctcc ggcatgctct tcaaccacga atccccgcgc
5701 cgcggccagg aattcgtcac ccgcaagatc agcctggcgg tcgcccgcat
5751 caagcaaggc ctccaggaca agctggcact cggcaacctc gacgcggtgc
5801 gcgactgggg ctatgccggc gactacgtcc gcgccatgca cctgatgctc
5851 cagcaggacg ccggcgacga ctacgtcatc ggcaccgggc agatgcactc
5901 ggtgcgcgac gcggttcgga tcgcgttcga acacgtcggc ctgaactggg
5951 aggactacgt cgtcatcgac cccgacctgg tgcggcccgc cgaggtcgag
6001 gtgctgtgcg ccgacagcgc caaggcccag gaccgcctcg gctggaagcc
6051 ggacgtcgac ttccccaccc tcatgcgcat gatggtcgat tccgacctgg
6101 cgcaggtttc ccgcgaaaac caatacggcg acgtgctgct cgccgccaac
6151 tggtagcagt tctcaagctt tcgaaaacta gtgaattcct gccggaattc
6201 cgacgacact cgccacatgg attcccccgg tgagtggcga atccaggtgg
6251 cgaatccgaa cgtaccgccg aaacggcgtg gagaagtcgg acgccattca
6301 cgtgcggccg tcggctgtcg agatgggttg agttgagatg gacaacgaac
6351 aaaaactccg ggattacctc aagcttgcga cggccgacct tcgacgcacc
6401 cggcggcgcg tccacaagct ggagtcggcg gcccaggaac cggtggccat
6451 catcggcatg acctgtcgct accccggcgg cgtccgcagc cccgaagacc
6501 tctggcgcat ggtcgaggcc ggcgagcacg gcgtcacccc gttccccacc
6551 gaccgcggtt gggacctgga ggcgctggcc gccgcgccga ccgcctccgg
6601 cggattcctg cacgacgcac ccgacttcga cgcggacttc ttcggcatct
6651 cgccgcgcga ggcggtcgcc atggacccgc aacagcgcgt cgtcctggaa
6701 tccgcctggg aggcgttcga acgcgccggc atcgacccga cgtccgtgaa
6751 gggcagccgc accggagtct tcatcggcgc gatggcccag gactaccggg
6801 tcggccccgc cgacggcgcc gagggcttcc aactcaccgg caacaccggc
6851 agcgtgctgt ccggccgcat ctcctacacc ttcggcacgg tcggccccgc
6901 cgtcaccgtc gacaccgcct gctcctcctc cctcgtcgcc gtccacctcg
6951 ccacccaggc gctgcgggcc ggcgagtgca ccctcgccct cgccggcggc
7001 gtcaccatca tgtccggccc cggcaccttc atcgaaatgg gccgccaggg
7051 cgggctctcc gccgacggcc gctgccgctc cttcggcgac accgccgacg
7101 gcaccggctg ggccgaaggc gtcggcatcc tcgtcctgga acggctgtcc
7151 gacgccgtcc gcaacggcca cgagatcctc gccgtcgtcc gcggcaccgc
7201 cgtcaaccag gacggcgcct ccaacggcct gaccgccccc aacggcccct
7251 cccagcagca ggtcatccag caggccctgg tcaacgcccg actcgccgcc
7301 ggggacatcg acgtcgtcga ggcgcacggc accggcacca ccctcggcga
7351 ccccgtcgag gcccaggccc tgctcgccac ctacgggcag aaccgcccgg
7401 cggaccggcc gctgctgctg ggctcggtca agtccaacct cagccacacc
7451 caggccgccg ccggcgtcgc cggcgtgatc aagatggtca tggcgatgcg
7501 gcacggcacc ctgccgcgca ccctgcacgc cgaggagccc acccaccacg
7551 tcgactggtc gcagggcgcc gtgcggctgc tgaccgacac caccgactgg
7601 cccgccaccg gggcgccgcg ccgcgccgcc gtctcctcct tcggcatcag
7651 cggcaccaac gcccacacca tcatcgagca ggcccccgaa ccgcagcccg
7701 aggacgccgc gaccgcgcag gacgacgccg ccggcagcac gccggccacc
7751 gcccccgtag tgcccggcgt cgtaccggtc ctgctctccg gccgcacccc
7801 ggacgccctg cgcggccagg ccgcggccct gcgcgccgcc ctcgacaccg
7851 gccggcggcc cgacctgctc gacctcgcac actccctcgc caccacccgc
7901 gccgggttcg agcaccgcgc cgtcctcctc gccaccgacc accccgccct
7951 gaccgacggc ctcaccgccc tcgccgacgc cgacgacccg gccgccgccc
8001 ccgcctggat caccggcacc acccgggccg agacccggct cgccgtcctg
8051 ttcaccggcc agggcgccca acgcctcggc gcgggacggg aactcgccgc
8101 ccgtttcccg gcgttcgcca ccgccctcga cgcggcgctc gacgccttca
8151 ccccgcacct cgaccgcccc ctgcgcgagg tcctgtgggg caccgacgcc
8201 gccctgctcg accgcaccgc atacgcccag ccggccctct tcgccgtcga
8251 agtggcgctc taccggctga tcgaatcgtt cggcgtccgc cccgaccacc
8301 tcgccggcca ctccgtcggc gagatcgtcg ccgcgcacct cgccggggtc
8351 ctctccctgg ccgacgccgc caccctcgtc gccgcccgcg gtcgcctgat
8401 gcaggcgctg cccgacggcg gggcgatgat cgccgtccag gcgtcggaag
8451 ccgacgtcgc cccgctgctc gccgggcacg aggaccaggt cgcgatcgcc
8501 gccgtcaacg gcccctccgc cgtcgtcctg tccggcgccg aagccaccgt
8551 caccgcgctc gccgaacagc tcgccgccga cggccgcaag acccgccggc
8601 tgcgcgtctc gcacgccttc cactcgccgc tcatggagcc gatgctcgac
8651 gccttccgcg ccgtcgtcga agacctcacg ctccagccgc cgctcctgcc
8701 ggtcgtctcc aacctgaccg gcaagcccgc caccgtcgcc caactcacct
8751 ccgccgacta ctgggtcgac cacgtccggc acgccgtccg cttcgccgac
8801 ggcatcgact ggctcgcccg gcacgacacc accgccttcc tcgaactcgg
8851 ccccgacggc gtgctgtccg ccatggccca ggactgcctg gacgccgccg
8901 acgcagacgc cgtcaccctc cccgccctgc gcgccgggcg ccccgaggag
8951 cacaccctca ccaccgccct cgccggtctg cacgtccacg gcgccaccct
9001 ggactggacc ggctgcttcg ccggcaccgg cgcccgccgc accgacctgc
9051 cgacctacgc cttccagcgc cgccgctact ggcccaaggc cctccagagc
9101 ggcaccgccg acctgcgctc ggtcggcctc ggtgccgccc accacccgct
9151 gctctccgcc gccgtctccc tcgccgacgc aggcggcacc ctgctcaccg
9201 gccgcctctc ccggcagacc cacccctggc tcgccgacca caccgtccgc
9251 ggcaccaccc tgctgcccgg taccgccttc ctcgaactcg ccgtccgcgc
9301 cggcgacgag gtcggctgcg accgcgtcga ggaactcacc ctcgccgcac
9351 cgctcctgct gcccgaacag ggcggcgtcc aggtccagtt gtggatcggc
9401 aaccccgacg tgtccggtcg ccgcaccgtc aacgtccacg cccgccccga
9451 caccggcgac gacaccccct ggaccgccca cgccaccggc gtcctcacca
9501 ccgccgacgc ctcccgccag ctcccggctt cgtccgagca gggcggcacc
9551 cccctcgccg gcgaccccca ccccgccctc gacgcggccc agtggccccc
9601 ggccggcgcc gaaccgctgc cgctggacgg ccactacgac cgcctcgccg
9651 acggcggctt cggctacggc ccggtcttcc agggcctgcg cgccgcctgg
9701 cgcggcggcg acgtcgtcta cgccgaggtc gagctgcccg aggccggccg
9751 gtccgacgcc gaggcgttcg gcctccaccc cgccctgctc gacgccgccc
9801 tgcacgccgc gcccttcacc ggcctcggcg aacgcggccg gggcggcctg
9851 ccgttctcct gggagggcgt ctccctccac gccggcggcg ccaccaccct
9901 ccgcgtccgc ctgaccccgg tcgccgacga cgcgctcgcc ctgaccgtcg
9951 ccgacggcac cggcgcgccc gtgctgtccg tcgactcgct cgtcctgcgc
10001 agcgtggcga cccaacagct cgacacggcc gccgccgtcg cccgtgacgc
10051 cctcttccgc ctcgactgga cccccgtcca gccgaccgcc accgaccccg
10101 ggcccgtcgc cctcctcggc gccgacccct tcggcctgct cacccacgcc
10151 ggattcgccg acgccccggc atacccggac ctcgccgccc tcgccgcggc
10201 ggacggcccg gtcccgacca ccgtcgtgct gtccctcgcc ggcaccgggg
10251 acgacgcggc cgacccggcc cggtccgcac accgctgcgc cgcggaggcc
10301 ctcgccgccg tacagacctg gctcgaccac catgagcgct tcgccgccgc
10351 ccgcctggtc ttcgtgaccc gtggtgcgac ggtcgggcgt gatgttgctg
10401 ctgctgcggt gtggggtctg gtgcgttcgg cgcagtcgga gaatccgggg
10451 tgttttgctc tggtcgatct ggatccggat ggtgcggtgg gtgcggctgc
10501 gctcgtcgct gcgttggtca gtggtgagcc gcagcttgcg gtgcgcggtg
10551 atgtgttgcg ggtcgcgcgt ctggtgcggc ggccgctcac cgaggtcggt
10601 gcgggtgctg atggcaccgg ggatggcgtc ggggatggct ctggtgtgtc
10651 gttctcgggt gagggtgcgg tcctggtcac tggtggtacg ggtggtctgg
10701 gtgcggtgtt ggcgcgtcat ctggtggccg agtatggggt gcgggatctg
10751 ctgttggtca gtcgcagtgg tgaacgtgcc gtgggtgctg gggagttggt
10801 ggcggagctt gcgggtgtgg gtgcgcgggt gcgggtggtt gcgtgtgatg
10851 tgaccgatcg tgccgcggtg gtggagttgg ttggcgggca tgcggtgtcc
10901 gcggtggttc atgcggctgg tgtgctggat gacggcatgg tgggtgcgtt
10951 gaccggggag cggttgtccg cggtgctgcg gccgaaggtg gatgctgtct
11001 ggcatctaca tgaggcgacc cgcggcctgg acctggacgc gttcgtcgtc
11051 ttctcctccc tcgccggggt cttcggcagt cccggccagg ccaactacgc
11101 ggccgcgaac gccttcctgg acgcgctgat gacgcggcgc cgggcggagg
11151 gactgcccgg cctgtcactc gcatggggac cgtgggagca gtcgggcgga
11201 atgacgggca ccctgacgga cgtcgacgcc gaacggctgg cccgctccgg
11251 tgtcccgccg ctctccgtgg cgcagggcct ggccctcttc gacgctgccg
11301 tggccgggac cgacgccacc tgcgttccgg tccgcctgga cctccccgtc
11351 ctccgcgcac ggggtgaagt gccgccgctg ctgaggtcgt tgatccgcgt
11401 ccgggcgcgc cgagccgccg tcgccggctc cgccaccgcg ggcaacctcg
11451 cccagcgcct gcgccgcctg gacgaggacg gccgcgacga gatggtcctg
11501 gacctcgtcc gcggtcaggt cgccctcgtc ctcggccacg cgaccggtgg
11551 cgacgtcgac gccggccgtg ccttccgcga cctcggcttc gactcgctga
11601 ccgccgtcga actgcgcaac cgcctcaaca ccgtcaccgg cctgcgcctg
11651 cccgccaccc tggtcttcga ctacccgacc gtccggcacc tcgccacgta
11701 cgtcctggac gagttgttgg gcacggatgc cgaggtggcg accgtgcagc
11751 cggccgcggt tgcggtggcg gacgatccga tcgtcatcgt gggcatggcc
11801 tgccgctacc ccggtggcgt cagctccccc gaggacctgt ggcgcgtgct
11851 caccgaaggc accgacgccg tctcgggctt cccgaccaac cgtggttggg
11901 acgtcgaatc cctctatcac ccggaccctg accacccggg tacctcctac
11951 acgcgctcgg gtgggttcct gcatgaggcg ggggagttcg atccggggtt
12001 cttcgggatg agtccgcggg aggcgttggc gaccgattcc cagcagcggt
12051 tgttgttgga gtcgtcgtgg gaggcgatcg agcgggccgg tattgatccg
12101 gtgagtttgc ggggtagtcg gacgggtgtg ttcgcggggg tgatgtacag
12151 cgattacagc gcgatgttgg cgagtccgga gttcgagggt ttccagggca
12201 gtgggagttc gccgagtttg gcgtcgggtc gggtggccta cacgttgggg
12251 ttggaaggcc cggcggtgac ggtggatacg gcgtgttcgt cgtcgttggt
12301 ggcgatgcac tgggcgatgc aggcgttgcg tagtggtgag tgtgggttgg
12351 cgttggccgg tggtgtgacg gtgatgtcga cgcctgcggt gtttgtggac
12401 tttgctcggc agcggggttt gtcgccggat ggccggtgca aggcgtttgc
12451 ggatgcggcc gatggtgtgg gctggtccga gggcgtcggc gtgttggtcc
12501 tggagcggca gtcggacgcg gtgcgcaatg gtcacgagat tttggctgtg
12551 gtgcggggtt cggcggtcaa ccaggatggt gcgtccaatg gtttgacggc
12601 gcccaatggt ccgtcgcagc agcgggtgat ccggcaggcg ttggccagtg
12651 gtggtctgac ggccggtgac gtggatgtgg tggaggcgca tggtacgggt
12701 acgacgctcg gtgatccgat cgaggcgcag gcgttgttgg cgacgtatgg
12751 gcgggatcgt gagcctgagc ggccgttgtt gttgggttcg gtgaagtcga
12801 atctggggca tacgcaggct gctgcgggtg tggcgggcgt tatcaagatg
12851 gtgttggcga tgcggcatgg tgtggtgccg cggacgttgc atgtggatgc
12901 gccttcttcg catgtggact ggtccgaggg tgcggtggag ctgctcagtg
12951 agcaggcggc ctggccggag acgggtcggg tgcggcgggc gggtgtctcc
13001 tccttcggca tcagcggtac caatgtgcat gtcatcgtcg agcaggcgcc
13051 gggcgccaag gcgatcgccg cggccggtgc ggcgcggcgc acgccgggtg
13101 ccgtgccggt gctgctctcg gggcgtggcc ggagtgccct gcggggccag
13151 gccgcccgcc tgctcggaca cctccaggcc cgacccgacg ccgaactcgt
13201 cgatgtcgca ctgtcgttgg cgaccacccg gtcccgcttc gagcagcggg
13251 ccgccgtcgt ggcgcaggac cgcgaccagc tgatcgcctc gctgggggcg
13301 ctggccgccg accgccccga ccccgccgtc gtcgagggcg aggccgccgg
13351 acgcggccgg accgcggtgc tgttcaccgg acagggcagc cagcgggccg
13401 ccatggggcg tgaactccac gaggtgcagc cggagttcgc cgcggcgttc
13451 gacgcggtgt gtgccgtttt cgacccgctg ttggaccggc cgctgcgcga
13501 ggtggtgttc gccgaggacg gcagcgacga ggccgcactg ctggacgaga
13551 ccggttggac gcagccggct ctgttcgccg tcgaggtggc gctgttccgg
13601 ctggtggaga gttggggtgt ccgtccggac ttcgtggccg gccattccat
13651 cggtgagatc gcggcggcgc acgtcgccgg ggtgctgacg ttggaggacg
13701 cctgccgtct ggtggccgcg cgggcgacgc tgatgcaggc gctgccgacc
13751 ggcggcgcga tgatcgcgat ccaggccacc gaggacgaga tcgcggcgca
13801 cctcgacgac acggtggcga tcgccgccgt caacgggccg cagtccgtgg
13851 tgatctccgg tgacgaggag gccgccgaaa cgatcgccgc cacgttcgcc
13901 gaacgcgggc gcaagaccaa gcggctgcgg gtgagccatg ccttccactc
13951 gccgcggatg gacgggatgc tggacgcctt ccggatcgtc gccgaggggc
14001 tgacctaccg ggcgccgcgc atcccgctcg tctccgacct caccggccgg
14051 cgcgccgacg atgcggaggt gtgcaccgcg gagtactggg tgcggcacgt
14101 ccgagaggcc gtgcggttcg ccgactgcgt gcggacgctg cgcgacgccg
14151 gggccaccac cttcctggaa ctgggctccg acggcctgct gaccgcgatg
14201 gccgaggaca ccctcggtga cgaccacgac gccgaactgg tgccgatgct
14251 gcgcgccggg cgcgccgagg aactggccgc ggccaccgcc ctggcccgcc
14301 tccaggtgcg cggcgtggac gtggactggg cggcgtacct cgccggcacc
14351 ggcgcccgac gcaccgacct gccgacctac gccttccagc acgcgtacta
14401 ctggccgcag ctgccgaccc cggccgccgc cctcgccgcc gccgatcccg
14451 ccgaccagca gctgtgggcc gctgtggagc gcggcgacgc ccgcgaactc
14501 gccgacatcc tcggcctggg cgaacaggac ctcacgccgc tggactccct
14551 gctgcccgcc ctcacctcgt ggcggcgcgg caaccaggag aagcacctcc
14601 tggacaccct gcgctaccgc gtggagtgga cacgactgag caagccgacc
14651 gccccggtcc tcgacggcac ctggctgctg gtcgcctccg acgccaccgc
14701 ggccgaccag ccagccctcc tcgacggcct ggccgacgcc ctcggctcgc
14751 acggcgcgcg ggtgcgtcgc ctgcttctgg acgactcctg cgcggaccgc
14801 gcggtgctcg ccgaacgact ggcgcggacc gccgacgtgg acgccgcgac
14851 ccaggtgctg tccgtgctgc cgctcgacga gcgggacgcc gacgactgcc
14901 cgccgctcac ccgcggactg gcgctgaccg tcgcgctcgt ccaggccctc
14951 gccgacaccg gcgcccaggg ccggctgtgg accgccaccc gcggcgccgt
15001 ctccaccaac cccgccgacc cggtcaccca ccccgtccag gccgctgcct
15051 ggggcctggg ccggggcgtc gccctggagc acccacggct gtggggcggc
15101 ctggtcgacc tgccgcaggt cttcgacgag cgggccggac agcggctcgc
15151 cgggatcctc gccgtcaagg acgcaccgga cggcgaggac caggtggcgc
15201 tgcgggccac cggagtctcc ggccgccggc tcgtccgcca caccgtcgaa
15251 gcgctgccca cggccgcgga gttcaccgcc accggcactg tcctgatcac
15301 tggtggcacc ggtggcctgg gcgccgaggt cgcccggtgg ctggcccgcg
15351 ccggcgccca gcacctcgtc ctgaccagcc gccgcggccc ggacgcgccg
15401 ggcgccgccg aactccgggc cgaactggag ggctacgggc cgtcggtgtc
15451 cgtcgtcgcc tgcgacgtcg ccgaccggga cgcgctcgcc gccgtcctca
15501 ccgcactgcc cgaggaactg ccgctgaccg gtgtcgtgca caccgcaggc
15551 gtcggccact acggcccgct ggacaccctg agcaccgccg agttcgccgg
15601 cctcaccgcc gccaagctcg ccggcgccgc ccacctcgac gccctgctcg
15651 ccgaccgcga actggacttc ttcgtcctct tcggctccat cgccggtgtc
15701 tggggcagtg gcaaccagag cgcctacggc gccgccaacg cctacctcga
15751 cgcgctcgcc ctgcaccgcc gcgcccgcgg cctcgccgcg acctccgtcg
15801 cctggggccc gtgggccgag gccggcatgg ccgccgacga tgccgtttcc
15851 gagaccctgc gccgccaggg cctcggcctg ctcgacccgg ccccggccat
15901 gaccgagctg cgccgcgccg tcgtccggca ggacgtcacc gtcaccgtcg
15951 ccgacgtgga ctggcagcgc tacgcaccgc tgttcacctc cgcccggccc
16001 agcgccctga tcgccggcct gcccgaggtc cgcgccctcg ccgccgacga
16051 gcgcaccgag caggacgcca ccggcgcctc cgaggtcgtc acccgcgtcc
16101 gcgccctggc cgaacccgag caactgcgcc tgctgaccga cctcgtccgc
16151 accgagtccg ccaccgtcct cggccacagc tccgccgacg ccgtgcccga
16201 gggccgcgcc ttccgcgacg tcggcttcga ctcgctgacc gcggtcgagc
16251 tccgcaagcg cctgggcgcc gcgaccgggc tgtccctgcc cagcaccatg
16301 gtcttcgact acccgacacc gctggaactc gcccagtacc tgcgggcgga
16351 gatcctcggc gcggtgctgg aagtcgccgg cccggtcgcc accggcggcg
16401 ccgacgacga gccgatcgcc atcatcggca tggcctgccg cttccccggc
16451 ggcgtcagct ccccggaaca gctgtgggac ctggtcgcct ccggcaccga
16501 cgcgatcagc gagttccccg tcaaccgcgg ctggcagacc gggcacctct
16551 tcgacccgga ccccgaccgg cccggcacca cctactccac ccagggcggc
16601 ttcctccacg aggccgacga gttcgacccc accttcttcg gcatctcgcc
16651 ccgcgaggcg ctggtcatgg acccgcagca gcggctcctg ctggagacca
16701 cctgggagtc cttcgagcgc gccgggatcc gcccggaaac cctccgatcc
16751 accctgaccg gcaccttcgt cggctccagc taccaggagt acggcctggg
16801 cgcgggcgac ggcaccgagg gccacatggt caccggcagc agccccagtg
16851 tgctctccgg ccgactgtcg tacgtcttcg gtctggaagg cccggcggtc
16901 accgtcgaca ccgcctgctc gtcctcgctc gtggcgctgc acctggcctg
16951 ccagtcgctg cgcaacggcg agagcaacct ggccgtcgcc ggcggcgcca
17001 cgatcatgac gacgcccaac ccgttcatcg cgttcagccg gcagcgcgcc
17051 ctcgccaagg acggccgctg caaggcgttc tccgacgacg cggacggcat
17101 gacgctcgcc gagggcgtcg gcgtcgtcct cgtcgagcgg ctctccgacg
17151 cgcagcgcaa cggccacccg gtcctggccg tcctccgcgg ctccgccatc
17201 aaccaggacg gcgcctccaa cggcctgacc gcgcccaacg gcccgtccca
17251 gcagagggtc atccgccagg ccctcgccaa cgcccgcctc gcgcccgggg
17301 acatcgacgc cctggaggcg cacggcaccg gcacaccgct cggcgacccc
17351 atcgaggccc aggcactgtt cgccacctac ggccgcgacc gcgaccccga
17401 gagcgcgctg ctgctcggct cggtgaagtc caacatcggc cacacccagt
17451 ccgccgcggg catcgccagc gtcatcaaga tggtcatggc gctgcgccac
17501 tccgaactgc cgccgaccct gcacgccgac gcgccgtcct cgcacgtgga
17551 ctggtcggcc gggacggtcc ggctgctgac ccaggcgcgc gcctggccgg
17601 agaccggtcg cccgcgccgg gccgcggtgt cctcgttcgg catcagcggt
17651 accaacgccc atgtcctgct ggagcaggcg cccgtcgcgg acaccccggc
17701 cgaggagcgg cccgccgtgg cgccggtccc gatcgccgcc ggcgtcgtcc
17751 cgtgggtggt caccgcccgc agcgccgccg ccctgcgcgg ccaggccgag
17801 cgcctcctcg cgcacgccga aaccgtcgga accgccctgc cggccgccgg
17851 accgctcgac atcggcctgt cgctggtctc cgcgcgcgcc cgtttcgagc
17901 accgtgccgt cgtcgtcccg cccgcgggca ccgacccgct ggccgcgctg
17951 cgcgccgtcg cgacggacgg gccctcgccc gtggtcgccc gtggcgtggc
18001 ggacgtcgag ggtcggacgg tgttcgtgtt ccccggtcag ggttcgcagt
18051 gggtggggat ggggtcccaa ctccttgatg agtcggcggt gttcgcggag
18101 cggattgccg agtgtgcggc ggcactcgcc gagttcaccg actggtcgct
18151 ggtcgatgtg ctgcggggtg tggtgggtgc gccgtcgttg gagcgggtcg
18201 atgtggtgca gccggcgtcg ttcgcggtga tggtgtcgtt ggctgcgttg
18251 tggcgttccc gtggtgtgtt gccggatgcg gtggtggggc attcgcaggg
18301 tgagatcgct gctgcggtgg tgtcgggtgc gctgtcgttg cgggacgggg
18351 cgcgggtggt ggcgctgcgg agtcaggcca ttggtcgtgc gttggcgggg
18401 cggggaggga tgatgtccgt cgcgctgtcg gtggacgtgc tcgaaccgcg
18451 gttggtcgag ttcgaggggc gggtgtcggt ggccgccgtc aacggcccgc
18501 gctccgtcgt ggtcgccggc gagcccgagg cgctggacgc gctgcacgcc
18551 cggctgaccg ccgacgacat ccgggcccgc cggatcgcgg tggactacgc
18601 ctcgcactcg caccaggtcg aggacctgca cgaggaactg ctggaggtgc
18651 tggcggagct ggcgccgcgc acgtcggagg tgccgttctt ctcgaccgtg
18701 accggcgact ggctggacac cgcgcggatg gacgccggct actggttccg
18751 caacctgcgc ggacgggtgc ggttcgcgga cgcggtggcg gacctgctgg
18801 cggcggagta ccgcgcattc gtcgaggtca gctcgcaccc ggtgctgtcg
18851 atggcggtgc aggaggcgat cgacgaggcc ggcgtgccgg ccgtcgccgc
18901 cggcaccctg cgccgcgacc agggcggcac cgaccgcttc ctgctgtcgg
18951 ccgccgaggt cttcgtgcgc ggtgtggacg tggactgggc ggggctgttc
19001 gaggggaccg gtgcgtcccg gatcgacctg cccacctacg ccttccaaca
19051 cgaacacctg tgggccgtcc cgcccgcccc ggaggccgtc gccgccgccg
19101 acccggacga cgcggccttc tggaccgcgg tcgaggacgg tgacgtctcc
19151 gcgctgaccg ccgcgctcgg caccgacgag gactccgtcg ccgccgtgct
19201 gcccgccctg acctcctggc gccgggcccg ccgcgaccgc tccaccgtgg
19251 acgcctggcg ctaccgcgtc gcctggaaac ccctcggcgg caccctgccg
19301 cacccgtccc tgaccggcac ctggctgctg gtcaccgccg acggcatcga
19351 cgacaccgat gtggcagggg cgttggagac ctacggcgcc gaggtgcgcc
19401 ggctggtcct ggacgaggag tgcgtcgacc gcgccgtcct gcgggagcgg
19451 ctggccggcg cggaggacgt gaccggcatc gtctccgtcc tcgccgccgc
19501 cgagcggacc gacgcggtac cgggcacctc cctggtgctc ggcaccgccc
19551 tgaccgtggc actgatccag gccctcggcg acgccgaaat cgacgctccc
19601 gtatgggcgt tgacccgcgg cgcggtctcc accggccggg ccgacgagct
19651 gaccgcgccc gtccaggcac aggtcaccgg catcggctgg accgcggcgc
19701 tggagcaccc gcagcgctgg ggcggcaccc tcgacctgcc cgccgccctc
19751 gacgcccggg ccgcccagcg gctcgccgcc gtgctgtccg gcgccctcgg
19801 cagcgacgac cagctggcca tccggccctc cggggtcttc acccgccgca
19851 tcgtgcgggc cgaggccacc gccgggcggc ccgccggcac ctggacgccg
19901 cgcggcacca cactggtcac cggcggctcc ggcaccctcg ccccgcacct
19951 cgcccgctgg ctggcccaac gcggcgccga gcacctggtc ctgatcagcc
20001 ggcgcggcac ggccgccccg ggcgccgccg aactcgtcgc ggaactggcc
20051 gagtcgggca ccgaggcgac cgtcgccgcc tgcgacatca ccgaccgcga
20101 cgcggtcgcc gcgctgctgg ccgacctcaa ggccgacggg cgcaccgtcc
20151 gcaccgtcgt gcacaccgcc gccaccatcg agctgcacac cctggacgcc
20201 accaccctgg cggacttcga ccgggtgctg cacgccaagg tcaccggcgc
20251 ccaggtcctc gccgaactgc tcgacgacga agagctggac gacttcgtcc
20301 tgtactcctc caccgccggc atgtggggca gcggcgccca cgccgcctac
20351 gtcgccggca acgcctacct cgccgcgctc gccgagcacc gccgggccaa
20401 cggactgccc gccctgtcgc tgtcctgggg catctgggcc gacgacctca
20451 aactgggccg ggtcgatccc cagatgatcc ggcgcagcgg cctggagttc
20501 atggacccgc agctggccct gagcggcctg cagcgggcgc tggacgacaa
20551 cgagaacgtg ctcgcggtcg ccgacgtgga ctgggagacc taccaccccg
20601 tctacacctc cggccgaccc accccgctct tcgacgaggt gccggaggtc
20651 cgccggctca ccgcggccgc cgagcagagc gccgggaccg tcgccgaggg
20701 cgagttcgcc gccgcgctgc gcgccctgtc cgacgccgag cagcagcgca
20751 ccctgctgga gaccgtccgc accgaggcgg cgtccgtcct cgggctgtcc
20801 tccgccgagg acctcaccga ccagcgggcc ttccgcgacg tcggcttcga
20851 ctcgctgacc gccgtcggcc tgcgcaaccg gctcgcctcc gtcaccggcc
20901 tgacgctgcc ctcgacgatg gtcttcgact accccaaccc ggccgcgctc
20951 gccgcctatc tgcacggcga gctggccggc gcccggtccg ccgccgccgg
21001 cgccgccgcc gtcccgaccg gcgcccccga cgccgacgac ccgatcgcga
21051 tcgtcggcat gagctgccgc taccccggcg gggtcggctc cgccgaggac
21101 ctgtggcgga tcgccctgga cgaggtcgac gcgatctccg gcttccccgc
21151 cgaccgcggc tgggacgccg agggcctcta cgacccggac cccgaccggc
21201 ccggccgcac ctactccgtc cagggcggat tcctgcgcga cgtcgccgag
21251 ttcgacccgg gcttcttcgg gatctcgccg cgcgaggcgc tgtcgatgga
21301 cccgcagcag cggctcctgc tggagaccgc ctgggaggcg ttcgagcacg
21351 ccggcatcga cccggtcggc cagcgcggca gccgcaccgg caccttcgtc
21401 ggcgccagct accaggacta cgcctccggc gtgcccaaca gcgagggctc
21451 cgaaggccac atgatcaccg gcacgctctc cagtgtgctg tccggccggg
21501 tgtcctacct cttcggcttc gagggccccg ccgtcacgct cgacaccgcc
21551 tgctcctcct ccctggtcgc catgcacctg gcctgccagt ccctgcgcaa
21601 cggggagagc tcgctggccc tggccggcgg cgtcagcatc atgtccaccc
21651 cgatgtcgtt cgtcggcttc agccggcagc gcgccctcgc cgaggacggc
21701 cgctgcaagg cgtacgcgga cggcgccgac gggatgaccc tcgccgaggg
21751 cgtcggcctg gtgctgctgg agcggctgtc cgacgcccgc gccaacgggc
21801 accaggtgct cgccgtgatc cgcggctccg ccgtcaacca ggacggcgcc
21851 tccaacggcc tgaccgcacc caacggcccg tcccagcagc gcgtcatccg
21901 ccaggcgctg gccaactccg cggtggcgcc cggcgacatc gacgtcctgg
21951 agggccacgg caccggtacc gccctcggcg accccatcga ggcgcaggcc
22001 ctgctcgcca cctacggcca ggaccgcgcc cccgaacggc cgctgctgct
22051 cggctcggtg aagtccaaca tcggccacac ccagatggca tccggcgtcg
22101 ccagcgtcat caagctcgtc cgcgccctcc aggaaggcgt ggtgcccaag
22151 tccctgcaca tcgaccggcc ctccacccac gtcgactggt cctcgggcgc
22201 catcgggctg ctcaccgaac gcaccccgtg gcccgagacc ggccggccgc
22251 gccgcgccgc cgtctcctcc ttcggcatca gcggcaccaa cgtccacacc
22301 atcctcgaac aggcccccgc ggacgaggcg cccacgcccg ccgacccgcc
22351 gcgggacggc ctggtgccgg tcctgctctc cggccgcggc gaggccgcgc
22401 tgcgcgccca ggccgcccgc ctgctcgcct tcgtcgagga gcggcccgag
22451 gcccacctca ccgacctcgc ccactccctc gccacctcgc gcgccgcgct
22501 ggaacgccgc gccgccgtca tcgccgccga ccgcgacacc ctgacccgcg
22551 gcctgcgcgc cctgtccgac ggccggcccg accccggcct ggtccagggc
22601 accgcgggac gcggccggac cgccttcctg ttcaccggac agggcagcca
22651 gcgccccggc atgggccgcg aactccacga ccgctacccg gtgttcgccg
22701 acgcgctgga cgaggtgctg gcccggctcg acgacggacc ggaccggccg
22751 ctgcgcgagg tgctgttcgc cgcgcccgac tccgccgagg ccgcgctcct
22801 ggaccggacc ggctacgccc agcccgcgct gttcgccgtc gaggtcgcgc
22851 tgttccgcct gctgacgtcc tggggcctga ccccggacta cctggccggc
22901 cactccgtcg gcgaactcgc cgccgcgcac gtcgccggcg tgctgtcgct
22951 ggacgacgcc tgcactctgg tcgccgcccg cggccggctc atgcaggcgc
23001 tgcccgaggg cggcgcgatg gtcgccctgg aggccgcgga ggacgaggtc
23051 ctgccgctcc tggaggggct caccgaccgg gtgtccgtcg ccgccgtcaa
23101 cgggccgcgg tccgtggtcg tcgccggcgt cgaggaggac gtgctcctcc
23151 tcgccgacct cttcgccgcc gacgggcgcc ggaccaagcg gctgcgggtg
23201 agccatgcct tccactcgcc gctgatggac gccatgctcg acgacttcgc
23251 cgccgtcgcc cgcgggctga cctaccaccc gccgacgatc ccgttcgtgt
23301 cgaacgtcag cggcggcctg gccaccgccg aacaggtccg cacgcccgac
23351 tactgggtcg ggcacgtccg cgccgcggtc cgcttcgccg acggcatcga
23401 ctggctcgcc acccagggcg acgtccacac cttcctggag ctcggcccgg
23451 acggcgtgct cagcgccatg gcccgggaga gcctcaccga cccgtcccgc
23501 acggcactgc tgccgaccct gcgcggcgac cggcccgagg aacctgccct
23551 ggtcaccgcg gtcgccgcgg cccacgcgca cggcgcccgc gtcgactgga
23601 gcgggtactt cgccgaccac ggcgcgcgcc ggaccacgct gccgacctac
23651 gcgttccaac gcgagcggta ctggcccgac accaccgccg ccacgagcgc
23701 ccacacgccc ggatccgccc tcgacgccga gttctgggcc gccgtcgagc
23751 gggacgacgt cgccgccctc gccgcctccc tggacctgga cgacgccacc
23801 gtcaccgcga tggtccccgc gctcaccgcc tggcgccggc gccgcggcga
23851 gcagaccgaa ctggactcct ggcgctaccg cgtcacctgg aagccgcgcg
23901 gcggcgccac cgcacccgcc gccctcaccg gccgctggct cgtgctcgtc
23951 ccgcacgacc accaggaccg tcaggacgac gcgaccgcgg cctgggcagc
24001 cgacgtcgag accgccctgg gcaccaccac cgtccggctg acggtcacca
24051 ccaccgaccg cgccgcgctg gccgcccgga tcaccgaagc cgccggcgac
24101 cagggcccgt tcagcggtgt gctgtccctg ctgccgctcg ccaccggcga
24151 cgccggccac cccggtgcgc ccgccgccct caccctcacc accaccgccg
24201 tccaggccct cggcgacgcc ggcatcgacg cgccgctgtg gaacgtcacc
24251 cgcggggccg tggccgtcgg ccgcgccgaa caggtcaccg cgcccgaaca
24301 ggccgccgtc tggggcctgg gccgcgccgt cgccctggaa ctgccggccc
24351 ggttcggcgg caccctcgac ctgcccgcca ccctggacgg ccaggccgcc
24401 cgccggttgc gcgcggtgct cgcggctacc gacggcgagg acgcggtggc
24451 cctgcggccc tccggcgtct tcctccgccg cctggcccac gccccggccg
24501 gccccgacac cgcccgcacc gccttcgacc cggcggccgg caccgtcctg
24551 atcaccggcg gcaccggcgg catcggcggc cacgtcgccc gccgcctggc
24601 ccgcgacggc gccacccacc tgctgctcac cagccgccgc ggcccggccg
24651 cccccggcgc cgacgcgctc cgcgccgaac tggaggaact gggcgcccgg
24701 gtcaccctcg ccgcctgcga cgccgccgac cgcgacgcgc tggccgcgct
24751 cctcgccgaa ctgcccgacg acgccccgct gtgcgcggtg ttccacaccg
24801 ccggcgtcgt cgaggaccac gtcgtggacg cgctcacacc ggagaacttc
24851 gccgcggtgc tgcgcgccaa gaccgtcgcc gcccaccacc tgcacgagct
24901 gaccgccgac ctggacctcg ccgctttcgt gctgttctcc tccacggccg
24951 gcgtcctcgg cgccgccgga cagggcaact acgccgcggc caacgcccac
25001 ctggacgccc tcgccgaaca ccgccgctcc cacggcctga ccgcgctgtc
25051 cgtcgcctgg ggcccgtggg ccggctccgg catggtcgcc gacgccgccg
25101 aactcaccga ccgggtacgg cgcggcggct tcgaaccgct cgcccccgaa
25151 ccggccgtgc gcgccctgct gcgcgccatc gagaacgacg acaccaccgt
25201 cgcgctcgcc gacatcgact gggagcgctt ccagcgcgcc ttcgccgcgg
25251 tccgcccgct gccgttcgtc gccgacctcc ccgagaccgg ccgggccacc
25301 cccgcgaccg ccaccggcgc cgccaccggc ctgcggcagc aactcgccga
25351 actgccggag cacgagcgcc cggcagcggt cctggacctg ctgcgtaccc
25401 aggtcgccgc cgtcctcggc cacgccgacc cgcgcaccgt cgaggacgac
25451 cacgccttcc gcgacctggg cttcgactcg ctgaccatcc tggaactgcg
25501 caacgccctc aacgccgcca ccggcctgag cctgcccgcc accctggtct
25551 acgacctgcc caccccgcgc gagatggcgg acttcctgct cgccgaactc
25601 ctcggcaccc tgcccaccga caccgccgcg accgtcgcca gcacggcctc
25651 ccccaagctc tcagcttcgt tcgagcaggg cggtaccccc ttcgacgacc
25701 cgatcgccgt catcggcatc ggctgccgct tccccggcgg cgtcaccacc
25751 ccggaggagc tctggcagct cctcgacgag ggccgcgacg gcatcagccg
25801 cttccccgac gaccgcggct gggacctcgc cgcgctgggc gccggcgcct
25851 ccgacaccct ggagggcggc ttcctgaccg gcgtcgccga cttcgacgcc
25901 cggttcttcg gcatctcgcc ccgcgaggcg ctggccatgg acccccagca
25951 gcggctgctg ctggagacca cctgggaggc gctggagcgg gccggcatcg
26001 acccgaccac gctgcgcggc tccaccaccg gcgtcttcgt cggcaccaac
26051 ggccaggact acccgacgct gttgcgccgc tccgcctcgg acgtggccgg
26101 ctacgtcgcc accggcaaca ccgccagcgt gatgtccggc cgcctgtcct
26151 acgcgctcgg cctcgaaggc ccggccgtca ccatcgacac cgcctgctcc
26201 tcctcgctcg tcgccctgca ctgggccggc cgggcgctgc gcgccggcga
26251 gtgcgacctc gtggtggccg gcggcgtctc ggtcatggcc agcccggact
26301 ccttcgtcga gttctccacc cagggcggcc tggcacccga cgggcgctgc
26351 aaggcgttct ccgacgccgc cgacggcacc gcctggtccg aaggcgtcgg
26401 catcctcgtc ctggaacgcc tctccgccgc ccgccgcaac ggccaccagg
26451 tcctcggcct gatccgcggc accgccgtca accaggacgg cgcgtccaac
26501 ggcctgaccg cgcccaacgg cctctcccag cagcgcgtca tcgcccaggc
26551 actcgccgac gcccgcctgc gccccgccga catcgacgcg atcgaggcgc
26601 acggcaccgg caccaccctc ggcgacccga tcgaggcccg cgccctgatc
26651 accgcctacg gccgggaccg ggacgccgaa cggccgctgc tgctgggcac
26701 cgtcaagtcc aacatcgggc acacccaggc cgccgccggt gccgccggcg
26751 tcatcaagat gctgatggcg atgcgccacg gcaccctgcc caggacgctg
26801 cacgtgggca ccccgtccag ccacgtcgac tggagcggcg gcaccgtcgc
26851 gctcctcgac gacgcgcggc cctggccacg gaccgggcag ccgcggcgcg
26901 ccggcgtctc cgccttcggc gtcagcggca ccaacgccca cgtcgtcgtc
26951 gagcaggccc cggaaaccga agcccccgcc gccccggccg ccgagccggc
27001 gccggaggcc acgcccaccg tcgtcccctg ggtcgtctcc ggacgcagcc
27051 gggaagcgct ccaggcgcag ctggaccggc tcaccgcgca caccgccgcc
27101 caccccgcgc gctcggcggc ggacgtcggc cgctcgctgg ccaccgaccg
27151 cacgctcttc ccgcaccgcg ccgtgctgct cgccggcccg gacggggtgc
27201 gcgaggccgc ccgcgccgcc gcgccccgca cccccggccg caccgcgttc
27251 ctgttctccg gacagggcgc ccagcacgcc ctgatgggcc acgacctgta
27301 ccagcgcttc ccggtctacg ccgacgcact ggacaccgtc ctcgcccagt
27351 tcgacaccgt gctggacgtc ccgctgcgcg ccgcgctgtt cgccgcgccg
27401 ggcacccccg aggccgcgct cctggaccag accggcttca cccagcccgc
27451 gctgttcgcc gtcgaggtcg cactgttccg gctcgccgag tcctggcggc
27501 tgacgccgga cttcgtcgcc ggccactcca tcggcgagat cgccgccgcg
27551 cacgtcgccg gggtgttctc cctggaggac gcctgcacgc tggtcgccgc
27601 ccgcgcctcc ctcatgcagc aactgccgcg ggacggtgcg atggtggccc
27651 tggaagccac cgaggacgag gtcgcgccgc tgctcaccga cggcgtcgca
27701 ctcgccgcgg tcaacggccc ccgctcggtg gtcgtcgcgg gcgccgagga
27751 cgccgtccgc gcggtcgccg accggctcgc cgccgacggc cgccgcaccc
27801 gccggctgac ggtcagccac gccttccact cgccgctgat ggacccgatg
27851 ctcaccgact tcgcccgggt cgcggagggc ctgacctacc acgagccgcg
27901 catccccctc gtctccaccc tcctcggcgc cccggccggc gccgaactgc
27951 gcacccccga ctactgggtg cggcacgtcc gcgagaccgt gcggttcgcc
28001 gacggcgtgc gcgccctgca cgacgccggc gccggcacct tcgtggagat
28051 cggcccggac ggcgtgctca ccgccctgac ccagcagacc ctcgacaccg
28101 tcgaggccgg cgcgcccgcc gtcgtcgtgc cgctccagcg ccgcgaccgc
28151 gccggcgacc tcgccctcct ggagggcctg gccaccctgc acacccacgg
28201 caccggcccg tcctggcccg cctacttcga ggccaccggc ggccaccgga
28251 ccgatctgcc cacctacgcc ttccagcggg agcggtactg gcccgaactc
28301 ggcgcacccg tcgccaccgc cccgcaggac ccggcggcct ggcgctacca
28351 cgagacctgg gccccgctgc cggcacccga ggcggccgcg ccggccggcc
28401 gcgccctggt cctcgtcccg gccgggaacc gcgacaccgc gtggatgacg
28451 gccgtcgccg acgcgctcgg cgccgacacc gtcacggccg aacccgacgc
28501 cctggccgag cagttgacgg ccgccggcga cacaccctgg cgcgtcgtgg
28551 tgtcgctgct cgccgccgcg tccgaggggc tgcccgcgga cggcgcctgg
28601 cccgccgccc tcctcgccac cctggacgag gccggcgtgc acgcgcccct
28651 gtggtgcgtc acccgcggcg ccgtcgcggt cgcgggggag gccccgaccg
28701 ccgtcggcca ggccgccctg tggggcctgg gccgggtcgc cgcgctggac
28751 cacccggacc gcttcggcgg cctggccgac ctgcccgccg acaccgacgc
28801 gcacgccgcc gggctgctcg ccgcgcacct ggccgcgccg ggcaccgagg
28851 ccgagatcgc ggtccgcgcc accggcgtcc acgcccgtcg cctggtccgt
28901 acgccggccg ccgccgacgg tgccacctgg ctgccgaccg gcaccgtcct
28951 ggttgtcggc ggcaccggcg gcaccggcac catgggcggc cgggccgccc
29001 gctggctggt ccgcgagggc gcccgccacc tcgtcctgac cgcccccgac
29051 ggcaccacga ccgccgcgga caccgaggcc ctgacggccg aactggccgc
29101 gctcggcgcc cggatcaccg tcgtggacca cgaccccacc gccccggacg
29151 gcttcgccgc gctcctcgac ggactgcccg acgacacccc gctcaccgcg
29201 gtcgtgtacg cgccggaggc cgacgccgcc cccggcaccg cggccgagct
29251 gtccgccgca ctcgcccccg tcaccgccct aggcgccgcc ctcaccggcc
29301 ggccgctgga cgccttcgtc ctcttcggct ccatcgccgg gctctggggc
29351 gtgcgcggcc gggccgccga ggccgcgtcc ggcgcctacc tcgacgcctt
29401 cgcccgcgcc tgccgcgacc gcggcacccc ggcactggcc gtcgcctggg
29451 gcgcctgggc cgacctggtc ggcccgtccc tcgccgcgca cctgcggatg
29501 aacggcctgc cggtgatgga cgcggacacc gcactgaccg ccctcagccg
29551 ggccgtcgcc gacgggtccg ccgccgaggc ggtcgccgac gtccgctggg
29601 agaccttcgc gcccctccac cacgaggccc gccgcaccgc cctgttcgac
29651 gccctgcccg aggcccgcgg cgcgctcgcg gaggccgccc gggaccgcgc
29701 cgaccggaag accgccgccg gcgactacgg ccggtggctc gccgagcagc
29751 ccgccgcgga ccacgacgcc atcctgctgg cactggtcac cgagaaggcc
29801 gcgaccgtcc tcggccacgc cgaccacgac ctgctcgaac ccgacctgcc
29851 cttccgcgac ctgggcttcg actcgctgac cgcggtcgac ctgcgcaacc
29901 agctcaccgc ggaaaccggc ctcaccctgc ccgccaccct cgtcttcgac
29951 caccccaacc cggccgccct cgccgcccac ctgcgcgccc aactcctcgg
30001 cgaggcgagc gactccgccg caccggtggc cgcccccgtc gccctcggtg
30051 ccgacgacga cgcgatcgtc atcgtcggca tggcctgccg ctaccccggc
30101 ggggtcacct cgcccgagga cctgtggcag ctggtcggcg acgaggtcga
30151 cgcggtcggc gacttcccga ccgaccgcgg ctgggacctg gccgcgctcg
30201 ccggcgacgg accgggccgc agtgccaccg cccagggcgg attcctctac
30251 gacgccaccg acttcgaccc cggcctgttc ggcatctcgc cgcgcgaggc
30301 cctggtgatg gacccgcagc agcggatcct gctcgaaacg tcctgggagg
30351 ccctggagcg ggccggcatc gacccggcga cgctgcgcgg cagcggcacc
30401 accggcgtct tcgtcggcgg cggctccggc gactaccggc cgccggagga
30451 ggccgggcag tggcagaccg cccagtccgc cagcctgctc tccggtcgcc
30501 tcgcctacac cttcggcatc cagggcccca ccgtgtcggt cgacaccgcc
30551 tgctcctcgt cgctggtcgc gctgcacctg gccgcgcagg ccctgcgcgc
30601 cggcgaatgc tcgatcgcgc tggccggcgg cgtcaccgtg atggccaccc
30651 cggtgggctt cgtcgagttc agcgcccagg gcgccctgtc gccggacggc
30701 cgctgccgcg ccttctccga cgacgccaac ggcaccggct ggtccgaagg
30751 cgtgggcatg ctcgtcgtcg aacggctctc cgacgcccgc cgcaacggcc
30801 accgcgtcct cgccgtgctc cgcggctccg ccatcaacca ggacggcgcg
30851 tccaacggcc tgaccgcccc cagcggcccc gcccagcagc gcgtcatccg
30901 ccaggccctc gccaacgccc gactgcgccc cgccgacatc gacgccgtcg
30951 aggcccacgg caccggcacc aggctcggcg accccatcga ggcccaggcc
31001 ctgctcgcca cctacggcca ggaccgcgag cggcccgtgc tgctcggctc
31051 gctcaagtcc aacatcggcc acacccaggc cgcctccggc gtcggcggcg
31101 tcatcaagat ggtcctcgcc atgcagcacg gcgaactgcc gcgctccctg
31151 tacgccgaga acccctcgtc gcacgtggac tggaccgccg gccgcgccca
31201 cctgctcacc gccaggaccc cgtggcccga ctccggtcgg ccgcgccgcg
31251 ccgccgtctc ctccttcggc gccagcggca ccaacgccca cgccatcctg
31301 gagcagccgc cgcgcgagga actccccgcg cgccccgcgg acgacggcgc
31351 cccgctgccg ttcttgctct ccggccgctc gcagaacgcc ctgcgcgccc
31401 aggcccgccg actcctggcc cgcctcaccg cccaccccga cacccgggcc
31451 gccgacctgg cgtactccct ggcgaccacc cgggccgcct tcgagcaccg
31501 ggccgcgatc accgccaccg accacgacgg cctccgcacc ggcctgaccg
31551 ccgtcgccga gggcaccacc gccccgcaca ccgccgaaca ccacctccag
31601 ggcaccggaa agcgcgccgt gctcttctcc ggccagggct cccagcgcct
31651 gggcatgggc cgcgaactgc acgagcgcca cccggtgttc gccgaggcgt
31701 tcgactccgt actggcccgc ctcgacgacc ggctcgacac ccccctgcgg
31751 gacgtcgtct ggggcaccga cgaggaggcg ctgcacgcca ccgggaacac
31801 ccagcccgcc ctgttcgccg tcgaagtcgc gctctaccgc ctgatcgaat
31851 cctggggcgt gcggcccgac ttcgtggccg gccactccgt cggcgagctc
31901 gccgcggccc acgtcgccgg ggtgctctcc ctggacgacg cctgccgcct
31951 ggtcgccgcc cgcgccgccc tcatgcagcg cctcccggcc ggcggcgcca
32001 tgatcgccgt cgaggccacc gaggacgagg tcaccccgct cctcaccgac
32051 ggcgtgtccc tcgccgcggt caacggaccg accgccgtgg tcctctccgg
32101 cgcgggcgac gccgtgaccg ccctgggcca ggcgctggcc gaacggggcc
32151 accgcaccac ccggctgcgg gtcagccacg ccttccactc gcacctcatg
32201 gacccgatgc tggcggactt ccgcaccgtc gccgagggcc tggaatacca
32251 cccgccgcgc atccccgtgg tctccaacct caccggggac gtcgccgacg
32301 cggccgacct gtgctccgcc gactactggg tgcgccacgt ccgcggcacc
32351 gtacggttcg ccgacggcgt gcgcaccatg gccgaccgcg gcgtgcacct
32401 cttcctcgaa ctcggcccgg acgccgtgct gtcggccatg gcccgccagt
32451 gcgcaccgga cgccgtcgtc gtcccggccc tgcgccgcaa ccgcgacgag
32501 gacgagacgc tggtcggcgc cgtcgcgcga ctgcacgtcc acggcgcggg
32551 tccgcgctgg gacgcgtact tcgccggccg cggcgcccag tggctggacc
32601 ttccgacgta ccccttccag cgcggccgct tctggccgga gtcccttccg
32651 ggcgccgcat cggccgcccc ggcagccgga cagccggccg agaccgacgc
32701 ggccttctgg gacgccgtcg cacaggagga cttcaccgca ttggaatccg
32751 tactcgacgt cgagagcgac gcactgtcca aggtgctgcc ggccctgatg
32801 gactggcgca gccgccaggc cgacgagtcc caactggcag gctggcgcca
32851 ccgcatcgtc tggaagcggc tcaccggcgc cgccctggca caccgcaagg
32901 cgctcagcgg cacctggctc gcggtggtcc ccgagggctt cgccgacgac
32951 ccctgggtga ccaccaccct ggacggcctc ggtacccacc tcgtgcatct
33001 ggaggtcgcg gaggccgacc gggccgcgct ggccgacgcg atcgcggccc
33051 gcaccgccga cggcacccgc ttcggcggcg taatctccct gctggccctg
33101 cgcgaggagc tcaccggcgc ggtgcccgag gggaccgccc tgaccaccac
33151 cctcctccag gccctcggcg acgccggcgt cgacgcaccg ctgtggtgcg
33201 tcacccgcag cgccgtctcc gccggccgca ccgaccggcc gcaccgaccg
33251 ctccaaggcg ccgtctgggg cctgggccgg gtcgcggccc ttgagtaccc
33301 gcagcgctgg ggcggcctgg tggacctgcc ggaggagccc gacgagcggt
33351 ccgcggccgg cctcgccgcc gtcctggccg gtctggacgg cgaggaccag
33401 gtcgccgtgc gcggcaccgc ggtgctcgcc cgccgcctgg tgccggctcc
33451 cggccgcaag ccgtcccggc cctggcaccc gtccggcacc gtcctggtca
33501 ccggcggcac cggcgccctc ggcgcgcacg tcgcccgccg cctggccaag
33551 gacggcgccc agcacctcgt cctgctcagc cgccgcggcc cggacgctcc
33601 cggtgcggcg gaactgcgcg cggaactgga cgcgttgggc accgacgtca
33651 cggtcgccgc ctgcgacgtc gccgaccgcg accagctgac ggccgtcctg
33701 gacgcgctgc ccgccgaccg gccgctgacc ggtgtggtgc acaccgccgg
33751 cgtcctcgac gacggcgtac tggaccggct cacccccgag cggttccagg
33801 aggtgttccg cgccaaggtc acctcggccc tgctgctgga cgagctgacc
33851 cgcgaccgcg agctggccgc gttcgtcctc ttctcctccg cctccgccgc
33901 ggtcggcaac ccgggccagg ccaactacgc cgctgccaac gccgtcctgg
33951 acgcgctcgc cgaacagcgc cgggtgctcg gcctgcccgc cacctcggtc
34001 tcctggggtg cctggggagg cggcggcatg gccgacgccg acggcgcgga
34051 cgaggccgcc cggcgcgccg gcgtcggcgc catggacccg cacctcgccg
34101 tggaagccct gctgcgcctg gtcgccgaga aggagccgac cgcggtggtc
34151 gccgaggtgg ccctggaccg gttcgccggc gccttcggcg gcagccgacc
34201 cagcgccctg ctgcgggagt tccccggcta ccgcgaggcg ctcgccgccc
34251 aggcggagca ggccgcggac ggcggcgggc tggccgcccg actggccgcg
34301 ctgccgcccg cccgccgcct ggacaccgtt gtggacctgg tgcgcacccg
34351 cgccgcgcag gtgctcggct accccgacac cgaagcggtc gccgccgaac
34401 ggtccttccg cgacctgggt gtcgactcgc tcggcgccgt cgagctgcgc
34451 aaccaactga gcgcggccac cggcctgaac ctgccggcga cgctggtgtt
34501 cgaccacccg acccccctgg tcctggggga gcacatcctc ggcgggctct
34551 tcccggacga gcccgccggg tccgacgacg agacggagat ccgggccctg
34601 ctggcctccg tcccgctcga ccaactgcgg gagatcgggg tcctggagcc
34651 cctgctccag ctcgccggac gcggcggccg ggccgcggac ggcgacgacg
34701 gcgagtccgt cgactcgatg acagtggcag acctggtgcg ggccgcgctc
34751 aacggccagt ccgacctgta gcgcgattga tggagcagac gatgaacgcg
34801 cccgagaacc ccgagacccc cgagaacaac gtagtcgccg cactccgcgc
34851 cgcggtcaag gagaccgacc ggctccggcg gcagaaccgg atgctggtcg
34901 cggcggccaa ggaaccgatc gccgtggtcg gcatggcctg ccgcttcccc
34951 ggcgccgtcg actccccgga agcgctgtgg gagatggtcg ccaccggcac
35001 cgacgtgatc tccggattcc ccgacgaccg cggctgggac ctggaggcgc
35051 tgcgcaacag cggcaccgac gcccgcgaca ccgacgtcag ccagcgcggc
35101 ggattcctgg actgcatcgc cgacttcgac cccggcttct tcgggatctc
35151 accgcgcgag gcggtcacca tggacccgca acagcggctc ctgctgacca
35201 ccgcctggga ggccgtcgag cgggccggca tcgacgccac cacgctgcgc
35251 gccacccgca ccggcgcgtt catcggcacc aacggccagg actacgccta
35301 cctgctcgtc cgctccctgg acgacgccac cggcgacgtc ggcaccggca
35351 tcgccgccag cgccgcctcc ggtcggctct cctacaccct cggcctcgaa
35401 ggccccgcgc tcaccgtcga caccgcctgc tcctcgtcgc tggtcgccct
35451 gcacctggcc gtgcaggcgc tgcgcaacgg cgagtgcggc atggcgctgg
35501 ccggcggcgt caacgtgatg gccacaccgg gctcgctggt cgagttcagc
35551 cgccagggcg ggctggcccg ggacggccgc tgcaaggcgt tcgcggacgc
35601 cgccgacggc accggctggt ccgagggcgc cggcgtgctg ctgctggaac
35651 ggctctccga cgcccagcgc aacggccacc cggtgctcgc cgtggtccgc
35701 ggctccgccg tcaaccagga cggcgcctcc aacggcttca ccgcccccaa
35751 cggcccctcc cagcagcgcg tcatccgcca ggccctcgcc aacgccggcc
35801 tggccaccgg cgacatcgac gcggtcgagg cgcacgggac cggcaccccg
35851 ctcggcgacc ccatcgaggc gcagagcatc ctcgccacct acggccagga
35901 ccgcgcccac ccggtgctgc tcggctcgat caagtcgaac atgggccaca
35951 cccaggccgc gtccggcgtc gccggcgtga tcaagatgat catggcgatg
36001 cggcacggcg tcctcccgcg gaccctgcac gtcgaccggc cctccaccca
36051 cgtcgactgg accaccggca gcgtcgaact cctcaccgac gcccacccgt
36101 ggcccgagac tggcaggccg cgccgcaccg gcatctcctc cttcggcgtc
36151 agcggcacca acgcccacgt catcgtcgaa caggcccccg acacccccgc
36201 cgaggcggct gacgacactc cgccccgcac cccgcggacc ctgccgtggc
36251 tgctctccgc ccgcaccggc gccgccctgc gcgaccaggc caccgcgctg
36301 ctcgaccacc tcgaccgccc cgacggcgac cgcgggccca ccgccctgga
36351 caccgcgttc tccctcgcca ccacccgcgc cgccctggaa caccggctcg
36401 ccgtcgtcac cggcaccgac ggcaccgccg gacgggacgc cctgaccgcc
36451 tggctggcgc acggcaccgc ccccgacgcc cacgaaggac acgccgccgg
36501 acgcacccgc tgcgcggccc tcttctccgg ccagggcgcc cagcgcctgg
36551 gcatgggccg cgaactccac gcccgtttcc cggtgttcgc acgggccctc
36601 gacaccgccg tggacctgct cgacgccgaa ctgggcggca ccctgcggga
36651 ggtgatctgg ggcaccgacg acgcgccgct caacgagacc ggcttcaccc
36701 agcccgccct gttcgccgtc gaggtcgccc tctaccgcct gatcgaatcc
36751 tggggcgtcg ccccggactt cgtcgccggc cactccatcg gcgagatcgc
36801 cgccgcgcac gtcgccgggg tgttctccct ggaggacgcc tgcacgctgg
36851 tggccgcccg cgccgggctg atgcaggcgc tgccgcgcgg cggggcgatg
36901 gtcgccgtcg aggccaccga ggacgaggtc agcccgctgc tcaccgacgg
36951 cgtcgcgatc gccgcgatca acggccccac ctcgctcgtc gtctccggcg
37001 acgagaccgc caccctcgcc gtcgccgccc gactcgccga acagggccgc
37051 cgcaccaccc ggctgcgggt cagccacgcc ttccactcgc cgctgatgga
37101 cccgatgctc gcggagttcc gcgcggtcgc cgagggcctg tcctacggcg
37151 aaccgcagat cccggtggtc tccaacctca ccggcgcggt cgccgacggc
37201 accctgctcg gcactgccga ctactgggtc cggcacgtcc gcgaggcggt
37251 ccgcttcgcc gacggcatcc gcgccctcac cgacgccggc gtcggcgcct
37301 tcctcgaact cggcccggac ggcacgctcg ccgccctggc ccagcagtcc
37351 gcccccgacg ccgtctccgt ccccgtcctg cgcaaggacc gggacgagga
37401 gcccgccgcg gtcgccgcac tggcccggct gcacaccgcc ggcgtcccgg
37451 tggactggac ggcgttctac ^ccggcaccg gcgcccaccg caccgacctg
37501 ccgacctacg ccttccagta cgagcgctac tggcccaagg ccacctaccg
37551 gcccgccgac gccaccggcc tcggcctgac cgccgccgac cacccgctgc
37601 tcggcgccgc catgtccgtc gccgggtccg acgagctcct gctcaccggc
37651 accctgtcgc tcgccaccca cccctggctc gccgaccacg tcgtcggcgg
37701 catggtcttc ttccccggca ccggcttcct ggaactggcg gtccgcgccg
37751 ccgaccaggt cggctgcgac cgggtcgagg aactcatgct cgccgcgccg
37801 ctgatcctgc ccgccaccgg caccgtccag atgcagatcg cggtcggcgc
37851 cgcggacgac gacggcggcc gcgacctgcg cttcttcacc cggcccgggg
37901 acgacccgga cgccgcctgg gcccagcacg ccaccggccg gatcaccgag
37951 ggcgagcgcg tcctcgccct cgacaccacc acctggccgc cccgcgacgc
38001 ggaacccgtc gacatcgacg gcctctacga ccgctaccgc gccaacggac
38051 tcgactacgg gcccgtcttc cgcggcctgc gcgccgtatg gcgccgcgac
38101 accgagatct acgccgaggt cgccctgccc gaaggcaccg ccgacgccga
38151 cgccttcggc ctgcacccgg ccctcttcga cgccgtcctg cacagcaccc
38201 tcttcgcctc cgccgacggc gacgaccgca gcctcctgcc gttcgcctgg
38251 aacggcgtgt ccctgcacgc cgcgggcgcg gacgcgctgc gcgtccggat
38301 caccagctgc ggccccgacg ccgtggagat caccgccgtc gacccgcagg
38351 gccgccccgt cgtctccgtc gaatcgctga cgctgcgcgc cgccggcccc
38401 gacgccggca ccgccgacca ccgtgccgac gcgggctccc tcttccgcat
38451 ggactggacg ccccgcaccg tccacgcccc ggccaccccc gccacctggg
38501 ccgtcctcgg taccgacccg atcggcctga ccgaggcgct caccgccgcc
38551 ggccccgaca ccgtcacggg actccgcgac ggtgtcgacg ccctcggcga
38601 actcaccgcc ggcgacgacc ggccggtgcc cgacgtggtc gcggtaccgc
38651 tgcgcggcgc caccgaccac gggccggccg gtgcccacga cctgacccgc
38701 accgtcctgg ccctgctcca ggaatggctg gccgaggagc gcttcgcccg
38751 ctcccggctg ctgctcgtca cccgcggcgc ggtcgccgac ggcgagcgcg
38801 gcccgctcga cctggccgcc gccccggtct ggggcctggt gcgctccgcc
38851 cagtccgaga accccggccg actgctgctc gtcgacctcg acgacaccgc
38901 cgagtccgcc gcccaactcc cgttgctgcc ggccctcctg gacgccgacg
38951 agccgcaggc cgtggtccgc gagggcaccg tccgggtcgg ccggctcgcc
39001 cgcctggact ccggccgcgg cctcgtcccg ccgcccggca ccccctggcg
39051 cctgggcagc cgggccaagg gcagcctcga cggcctcgcc ctgctgcccc
39101 accccgaggc ccgacgaccc ctcaccggcc acgaggtccg cgtcggcatc
39151 cgcgccgcgg gcctgaactt ccgtgacgtg ctcaacgcgt tggggatgta
39201 tcccggggat gcggggctgt tcggttcgga ggcggccggt gtggtcgtcg
39251 aggtcggacc ggaggtcacc ggcctggcac ccggcgaccg ggtcatgggc
39301 atgctcttcg gcggcttcgg accgctcggc atcgccgacg cccggctgct
39351 cacccccgtc ccggccgact ggtcctggga gacgggtgcg tcggtgccgt
39401 tggtgttcct caccgcgtac tacgccctga aggagttggg tggtctgcgg
39451 gcgggggaga aggtgctggt gcatgccggt gccggtggtg tcggtatggc
39501 ggcgatccag atcgcccggc atgtcggtgc cgaggtgttc gccacggcca
39551 gtgagggcaa gtgggacgtg ctgcgctccc tgggcgtggc cgacgaccac
39601 atcgcctcct cccgcaccct cgacttcgag gcggccttcg ccgaagtcgc
39651 cggcgaccgc ggcctggacg tcgtactgaa cgcgctgtcc ggcgagttcg
39701 tcgacgcctc gatgcggctg ctcggcgacg gcggccggtt cctggagatg
39751 ggcaagaccg acatccgcgc cgcggactcc gttcccgacg gcctctccta
39801 ccactccttc gacctcggca tggtcgatcc ggaacacatc cagcggatgc
39851 tgctcgacct cgtcgagctg ttcgaccgcg gcgcgctggc cgcgttgccg
39901 gtccgcagct gggacgtgcg ccgcgccggc gaggcgttcc gcttcatgag
39951 cctggcccag cacatcggca agatcgtgct caccgtgccg caacccctcg
40001 accccgacgg caccgtgctc ctcaccggcg gcaccggcgg cctggccggc
40051 ctgctcgccc gccacctggt caccgagcac ggcgcccgcc acctgctgct
40101 ggccggccgg cgcggccccg acgcgcccgg cgccgccgca ctccacgccg
40151 aactgaccgc cctgggcgcc gaggtcaccg tcgccgcctg cgacgtcgcc
40201 gaccgcaccg cgctcgccgc gctgctcgcc accgtgcccg ccgaacaccc
40251 cctcaccgcg gtcgtgcaca ccgccggcgt cctggacgac ggcaccctca
40301 ccgccctgaa ccccgaccgc ctcgccaccg tcctacggcc caaggtggac
40351 gccgcctggc acctgcacga cctcacccgc cacctcgacc tggccgcgtt
40401 cgtgctctac tcctccaccg ccggcgtcat gggcggaccg ggccaggcca
40451 actacgcggc cggcaacacc ttcctcgacg cgctcgccgc ccaccgacac
40501 gccctcggcc tgcccgccac ctcgctggcc tggggcgcct gggagcaggg
40551 cgccggcatg accggcgcac tgaccgacca cgacctgcgc cgggtcagcg
40601 acgccggcgg ccaaccgctg ctcaccgccg aacgcggcct cgccctctac
40651 gacgccgcca ccgccgccga cgaacccctg atcgtcccgc tcggcctcac
40701 cggcggtgcg ctgcccgccg gggtcggcgt ccccgccgtg ctgcgcggcc
40751 tggtccgcac cgcgggccgc cgggccaggg ccggcaccgc cggcgtctcc
40801 cgcgccggcc tcgccgaacg cctcgccgcc ctgcccgagg aggagcgcac
40851 ccccttcctc gtcgagctgg tgcgcaccga ggccgccacc gtcctcggcc
40901 acggctccac cgacccggtg gacgcccgcc gcgagttccg ccaactcggc
40951 ttcgactcgc tgaccgccat cgaactgcgc aaccgactcg gcaaggccac
41001 cggcctcacc ctgcccgcca ccctcatctt cgactacccg acccccgacc
41051 gcctcgccgt ccacctccac gacgaactcc tcggcgcgga cgccccggtg
41101 accgtcaccg ccgccgcaca ggccgcggac ccggagcacg acccggtcgt
41151 catcgtcggc atgagctgcc gcttccccgg cggcgtcagc tcccccgagg
41201 agctgtggga cctggtggca tccggcaccg acgcgatcac cggcttcccc
41251 gccgaccgcg catgggaccg ccacccgcag ctcgccggcg cccccggcgc
41301 ccgcaccggc cagggcggat tcctccgcga catcgccgac ttcgacgccg
41351 ccttcttcgg catctcgccg cgcgaggccc tggccatgga cccgcagcag
41401 cgcatcctcc tcgaagtcgc ctgggaggcc gccgagcgcg ccggcatcga
41451 cccgcagacc ctgcgcggca gcgacaccgg cgtgttcatg ggcgtcagcg
41501 gccaggacta cgccggcctc gtgatgcgct cccgcgacga catcgccggc
41551 cacgccacca ccggcctcgc cgtcagcgtc gtctccggcc gcctcgccta
41601 cgcgctcggc ctggagggcc cggccctgtc cgtggacacc gcctgctcct
41651 cctccctggt gtcgctgcac ctggccgccc aggcgctgcg cgcgggggag
41701 tgcaccatgg ccctggccgg cggcgtcacc gtcatgacca ccgccgccaa
41751 cttcaccggc ttctcccgga tgggcggcct cgcccaggac ggccgctgca
41801 aggcgttctc cgactccgcc gacggcaccg gctggtccga gggcgccgcc
41851 gtcctggtcc tggaacgcct ctccgacgcc cggcgcgccg gccaccgcgt
41901 actggccgtg gtgcgcggct cggcggtcaa ccaggacggt gcgtccaacg
41951 gtctgacggc gcccaacggt cccgcccagc agcgcgtcat ccggcaggcc
42001 ctggccaacg cgggcctgac ccccgtcgac gtggacgccg tcgaggcgca
42051 cggcaccggc accccgctcg gcgaccccat cgaggcccag gccctgatcg
42101 ccgcctacgg caccgaccgc gaccccgaac acccgctgct gctcggctcg
42151 gtgaagtcca acatcggcca cacccagtcc gcggccggcg cggccgggct
42201 ggtcaagatg gtcatggcca tgcgccacgg catcctgccg cagaccctgc
42251 acctcaccga accgtcctcg cacgtggact ggtcggcggg cacggtgcgg
42301 ctgctcaccg agcggaccgc ctggccgcgg acggatcgtc cgcgtcgggc
42351 cggggtctcc tcgttcggca tcagcggcac caacgcccac gtcatcctgg
42401 aacagccgcc cgccgagccc acccccg.ccg ccgacccggg ccggcccgca
42451 cccaccgtgg tggcctggcc cgtctccgcg cagaccccgg ccgccctcga
42501 cgcccaactg gaccggttgc gcaccgccgc cgccctggcg ccgctcgaca
42551 ccgcccacac cctcgccacc ggccgctcgc tcttcgaaca ccgcgccgtc






42601 ctgctcgcca ccgtcggcga cccggcgacc ggcgcccccg acctgcccga
42651 ggtcgccagg ggagcggcga cgccgcaccg caccgcgttc ctcttctcgg
42701 ggcagggtgc tcagcggtcg gggatggggc gtgaactgca tgctgctttc
42751 ccggtgttcg cggcggcgtt cgacgaggtg gtggctgtgt tggatgcgga
42801 gctgggttct gatgctgatg ggggtgtgtc gctgcgggag gtgatgtggg
42851 gcggggggtc ggagttgttg gatcgaacgc gtttcacgca gccggcgttg
42901 ttcgcggtgg aggtggcgtt gttccgtttg gtggcctcgt ggggggtggg
42951 gcctgagttc gtggcggggc attcggtggg tgagattgcg gcggcgcatg
43001 tggccggggt gttctcgttg gtggatgcgt gtcgtttggt ggtggcgcgg
43051 gcttcgttga tggatgcgtt gccggtgggt ggcgtgatgg ttgcggtgga
43101 ggcggccgag gcggaggtgg tgccgctgtt ggtcgatggg gtggcgatcg
43151 ccgcggtcaa cgggccggtc tcggtggtgg tctccggtgt ggaggcggcc
43201 gttgggcagg tcgtggatca gttggtggag cggggccggc gggtccgtcg
43251 gttggcggtc agtcatgctt tccactcgcc gttgatggat ccgatgttgg
43301 atgccttccg ggccgtcgcc gagggcctgg agtaccacca gccgcgcatc
43351 cccgtggtgt ccaacgtgac gggcgaggtg gccgcggcgg aggagctgtg
43401 cgcggccgac tactgggtgc ggcacgtccg ggcgaccgtg cggttcgccg
43451 acggcgtccg caccctggcc gagcgcggcg ccaccgcctt cctggagatc
43501 ggccccgacg gcgtactgtc cgcgctcgcc cgcggcgtcc tgcccgccga
43551 ggcgctcgtg acgcccaccc tccgcaagga ccgcgacgag gagagcgccc
43601 tgctcgccgg actggcccgg ctgcacgtcg ccggcgtgac cgtcgactgg
43651 agcgccgccc tgaccggcac cggcgcgcgc ggcaccgacc tgccgaccta
43701 cgccttccaa cgcgagcggt actggccgga gttggccgcc gaacccgcgg
43751 gcggcggcgc ggatgccgcg gacgcggagt tctgggccgc ggtggagcgc
43801 gcggacgcca ccgcgctcgc cgcccacctg gacatcgacg gcgaccagct
43851 cggcgccgtg ctgcccgcac tgtccgcctg gcgcacccgg cgccgcacca
43901 catcggccac caacgccctg cggcaccggg agagttggga accgctgtcg
43951 ctcgccggca cgccgcacac cggcggcgtc ctggtgctgg tgcccgccgc
44001 cgcgaccacc gacccctggg tcgccgacgt cgtcgccgcc ctcggcccgg
44051 acgcccgccg ggtggacgtc ccggccgacg gcaccgaccg ggccgcgctc
44101 gccgccctgc tcaccgaagc ggccgacgac accgccccga ccgccgtggt
44151 ctccctgctc gcgctcgacg agaccagcgg cgacgacgcg gtaccggccg
44201 gcaccaccgc caccgccgcg ctcgtccagg ccctcgccga caccggcgcc
44251 ccggccccgc tgtgggccct gacccggggc gcggtcgccg cgctccccga
44301 cgagcagccg accgcccccg cccaggccgc cgtctggggc ctcggccgga
44351 tcgccgccct cgaactcccg cgccactggg gcggactggt cgacctgccc
44401 gccgacctcg acgagcgcac cgcacgccga ctgcccgccg cactggccga
44451 cgccggcgac gaggaccagc tcgcgctgcg ggccaccggc gcctacggcc
44501 gccggatcac cccggcgccc gcccccgacg acgcccccgg caccgggtgg
44551 cagccgaccg gcaccgtcct gatcaccggc ggcaccggcg cgctcggccg
44601 gcacaccgcc cgctggctcg ccgcccacgg cgccgagcac ctgctgctgc
44651 tcagccgcag cggccccgac gcgcccggcg cggccgaact caccaccgaa
44701 ctcaccgccc tcggcgcccg cgtcaccctc gtggcctgcg acgccgccga
44751 ccgggagcag ctgaccaggg tcctcgccga ggtaccgcgg gactgcccgc
44801 tgaccggcgt ggtgcacacc gccggagtgc tcgacgacgg cgtgctcacc
44851 ggcctcaccc cggaccggtt cgccacggtc ttccgcgcca aggtggcctc
44901 cgccgtgctc ctcgacgagc tgacccggga caccgacctg gccgtcttcg
44951 cgctgttctc gtccgtcgcg ggcgctgtcg gcaaccccgg ccaggccggc
45001 tacgccgccg ccaacgcggt cctcgacgcc ctcgccgccc gccgccgggc
45051 ccagggcctg gccggcacct cgatcgcctg gggtgcctgg gccggcgacg
45101 gcatggcggc ccgtcacacc cgccccggcg ccgaacccgt cggcctgctc
45151 gaccccgacc tcgccgtacc ggccctggcc cgcgcggtga cggagcccca
45201 gcccaccctc gtcctcgccg acctccagca gccgcggctg ctggaatccc
45251 tgctggcgct gcgccccagc ccgctcctga gccggctgcc cgccgcccgc
45301 accgcggccc gcgcggtcca ggaagcggac cgccgccgag ccggcgccgc
45351 ggccgacctg cgcgaccaac tcgccggcac cgcacccgcc gaccgccacg
45401 ccgtcctcct ccgcctggtg cggaccacgg ccgccgcggt cctcggccac
45451 accggtgccg acgccatccg ggccgacaag cccttccgcg acctcggctt
45501 cgactcgctc accgcggtgg aactgagcag cgctctcgcc gccgccaccg
45551 gcctggccct cccgcccagc ctcgtcttcg accacccctc cccgcgggcg
45601 ctcgccgacc acctgcgcgc cgaactcacc ggcgaccggc cggaatccgc
45651 ccccgcggcc ccgccggcac cggtccccgc cgcggacgac gatccgatcg
45701 tcgtcgtcgg catggcctgc cgcttccccg gcggcgtcac cacccccgag
45751 gagttctggc agctgctcgc cgagggccgg gacggcatcg acgcgttccc
45801 caccgaccgc ggctgggacc tcgacgtgct cggccggcga cggccagggc
45851 cacagcgccc accgaggtcg gcggcttcct cttacgacgc ggccgccttc
45901 gaccccggct tcttcgacat ctccccgcgc gaggcgctcg ccatggaccc
45951 gcagcagcgg ctgctgctgg agaccgcctg ggaggccgtc gaacgcaccg
46001 gcaccgaccc gacccggctg cgcggcagcc gcaccggcgt gttcgtcggc
46051 accaacggcc aggactacgc cggcctcgtc ctgcgcgccc aggaggacgt
46101 cgaggggcac gccggcaccg gactggccgc cagcgtgatc tccggccgcc
46151 tcgcctacgc cttcggcttc gagggccccg ccgtcaccgt cgacaccgcc
46201 tgctcctcct cgctcgtcgc cctgcactgg gccgtccagg cgctgcgcgc
46251 gggggagtgc tccctggccc tggccggcgg cgtcaccgtc atgacgacct








46301 cgacgagctt cgccggcttc acccggcagg gcggcctggc gccggacggg
46351 cactgcaagg cgttctccga ctccgccgac ggcaccggct ggtccgaggg
46401 tgtgggcgtc ctcgtcgtcg aacgccgctc cgacgcgctc cgcaacggcc
46451 atgagatcct ggccgtggtg cgcggctcgg cggtcaacca ggacggtgcg
46501 tccaacggtc tgacggcgcc caacggtccc gcccagcagc gcgtcatccg
46551 gcaggccctg gccaacgcgg gcctggcccc cggcgacgtg gacgcggtcg
46601 aggcccacgg caccggcacc gtcctcggcg accccatcga ggcccaggcg
46651 ctgctcgcca cctacggcca ggaccggccc gccgaccggc cgttgtggct
46701 cggctcggtg aagtccaaca tcggccacac ccaggccgcc gccggcgccg
46751 ccggcctgat gaagatggtg ctggccctcc aacacggcac gctgccgcgc
46801 accctgcacg tcaccgagcc ctcgacccgg gtcgactggt cggccggcgc
46851 ggtgcggctg ctcaccgagc ggaccgtctg gccgcggacg gatcgtccgc
46901 gtcgggccgg ggtctcctcg ttcggcatca gcggcaccaa cgcccacgtc
46951 atcctggaac agccgcccgc cgagcccacc cccacggccc ctgccgaccg
47001 ccccacccgg acgcccgccg tcctcccatg ggtcgtctcg gcccgatcgg
47051 ccaccgcgct cgacgcgcag ctcgcgcgac tgcgggcgtt cgccgccgag
47101 cgcccggacc tgccgcccgc cgacgtcgcc cactcgctcg tcaccagccg
47151 cgccaccttc gaacaccggg cggtcctgct ggccgcgccc gacggcatca
47201 ccgcggccgc ccgcgccgag gcccgcgaac gcagcaccgc gttcctcttc
47251 tcggggcagg gtgctcagcg gtcggggatg gggcgtgaac tgcatgctgc
47301 tttcccggtg ttcgcggcgg cgttcgacga ggtggtggcg gtgttggatg
47351 cggagttggc gacgggttcc ggtgggggtg tgtcgctgcg ggaggtgatg
47401 tggggcgggg ggtcggagtt gttggatcgg acgcgtttca cgcagccggc
47451 gttgttcgcg gtggaggtgg cgttgttccg tttggtggcc tcgtgggggg
47501 tggggcctga gttcgtggcg gggcattcgg tgggtgagat tgcggcggcg
47551 tatgtggccg gggtgttctc gttggtggat gcgtgtcgtt tggtggtggc
47601 gcgggcttcg ttgatggatg cgttgccggt gggtggcgtg atggttgcgg
47651 tggaggcggc cgaggcggag gtggtgccgc tgttggtcga tggggtggcg
47701 atcgccgcgg tcaacgggcc ggtttcggtg gtggtctccg gtgtggaggc
47751 ggccgttggg caggtcgtgg atcagttggt ggagcggggc cggcgggtcc
47801 gtcggttggc ggtcagtcat gctttccact cgccgttgat ggatccgatg
47851 ttggatgcct tccgggccgt cgccgagggc ctggagtacc accagccgcg
47901 catccccgtg gtgtccaacg tgacgggcga ggtggccgcg gcggaggagc
47951 tgtgcgcggc cgactactgg gtgcggcacg tccgggcgac cgtgcggttc
48001 gccgacggcg tccgcaccct ggccgagcgc ggcgccaccg ccttcctgga
48051 gatcggcccc gacggcgtac tgtccgccct ggccgcggcc tgcctgttcg
48101 acacggacgc cgaagtggtg cccgcgctgc gcaaggggcg ccccgaggag
48151 cacaccgccc tcaccgccgc cgcccaactc cacgtggccg gcgtggacat
48201 cgactggacc gcggtcctgg ccggcaccgg cgggcggcgg atcgccctgc
48251 ccacctatgc cttccagcgc gagcggtact ggccctcgct cgccgcacag
48301 gcccccggcg acgccggcgg gctcggcctg gaagccgggc ggcacccgct
48351 gctcggggcc gcgaccaccg tcgccggatc cgcggagatc ctgctcaccg
48401 gccgcctgtc caccaccgcc cagccgtggc tcgcggtcta cgaggcggac
48451 ggccgcaccg tcctgccggc cgcggtcctc gccgaactcg ccgtccgcgc
48501 cggcgaccag gccgactgcc cgaccgtggc ggaactgacc gtcgccgcac
48551 cgctcgtcct caccggcgcg gcggcccagc gcctccaggt ccgggtggcc
48601 gcccccgacg acaccggccg gcgcgcgctg tccgtgcacg cccgacccga
48651 cgactccccc gacagcccct ggacgctgca cgccaccgcg gtcctcaccc
48701 acgacacccc gcagcccccg gcgccggaca ccggctggcc gccggagcgc
48751 gccgtgccgc tcgacgccct gcccaccgcc accggcccgg cccggatcgc
48801 ggcggcctgg cagtggggcg acgaactctg cgccgagatc gaactccccg
48851 aacccggccc ggcggagcgg gcattcgccc tgcacccggc gctgctggac
48901 accgcggtcc gcgccggcgg cctgctggac ggcgacgcca ccctggacgc
48951 cctcggctgg cggggcctcg ccctgcacgc cgcgtccgcc accgccctgc
49001 gggtccgcct caccccggac ggcacggaca cctgggctct ggaggccacc
49051 gacccgcagg gcgctccggt cgtctccgtc accgggctca ccctgggcac
49101 gcccaccgtc gaccggtcgg gggccggggc ggccgatgac ggcgcgaccc
49151 tgctcgacct ggagtgggtg cccgcgccgc aggccgcgcc caccggcggc
49201 gaccacctcc cgtacgccgt gctcggcgat caactcgcgg agctggacgg
49251 gcagttgagg atcgcgggcg acgggcccgg gcgcgtcgca tcgctggccg
49301 cgctgctgga cggcggtgcg ccgctgcccc ggctcgtcct cgcgccggtg
49351 ctgggcgtgc cgaccgggga aggcgacctg cccgccgcgg tgcgcggcac
49401 caccacggcg gtgctggagc tgctgcagcg ctggaccgcc gacgcccgca
49451 ccgccgacag ccacctggtg atcgtcaccc gcggcgccgt cgccgccggg
49501 gcggaggacg tgcacgacct ggcggcggcc ccggtctggg gcctggtccg
49551 ctcggcacag tccgaacacc ccggcagctt cctgctgctc gacctcgacc
49601 ccgccgatcc cgcgggagcc tcccgcgccg ccgcgccggc caccctggcg
49651 gccctgctcg acgcgggcga gacccaggcc gcggtgcgcg ccgacacgct
49701 caccgtcgcc cggctgaccc gggccgccga cggacccgag gccaccgccg
49751 gacacccggt gcgggactgg gaccgcgacg gcaccgtcct gatcaccggc
49801 ggcaccggcg gcctgggcgg cctcctggcc cgccacctgg tcaccggaca
49851 cggcatcaag cacctgctgc tcgccgggcg ccgcggcccg gacgcccccg
49901 gcgcgcgggc cctgcgcgac gaactggccg ccctcggcgc cgaggtgacc
49951 gtcgccgcct gcgacgtggc cgaccgtgcc gcactggacc gactcctcgc
50001


gcaactgccg 


ccggagcacc 


cgctgaccgc 


cgtcgtgcac 


accgccggcg 




50051 


tcctcgacga 


cgccaccgtc 


ggcaccctga 


cgcccgagcg 


gctggacacc 




50101 


gtcctgcgcg 


ccaaggcgga 


cgccgcctgg 


cacctgcacg 


acgccacccg 




50151 


cgaccgcgac 


ctggcagggt 


tcgtgctgta 


ctcctcggtc 


gccggtgtca 




50201 


ccggcggccc 


cggccagggc 


aactacgccg 


ccggcaacac 


gttcctcgac 




50251 


gcgctcgccg 


cgcaccgcgc 


cgcccagggc 


ctgcccggac 


tgtcgctggc 




50301 


ctggggaccg 


tgggggcagg 


acgccggcat 


gaccggcacc 


ctcggcgccg 




50351 


ccgacctggc 


ccgcctggag 


cgctccggca 


tgccgccgct 


caccccggaa 




50401 


cagggcctgg 


ccctgttcga 


cgccgccggc 


gcccgcggcg 


acgggttcgc 




50451 


ggtggcggtg 


cggctcgccc 


gtggcgccgc 


cgcaccgggc 


gccgacgagg 




50501 


tccccgcggt 


gctgcgtgcc 


ctggtgcgcg 


gccggcgccg 


cacggcggcc 




50551 


gcggccgggc 


acgccggtgt 


actggcccgc 


cggctggccg 


ccctggacgc 




50601 


cgagcagcgg 


catcaggcgc 


tgctcgacct 


ggtccgcacc 


gagacggccg 




50651 


cggtgctcgg 


ccactccggg 


gcggacgccg 


tcccggccga 


gcgggacttc 




50701


aaccggctgg 


gcttcgactc 


gctgatggcg 


gtcgaactgc 


ggacgcggct 




50751 


ggccaccgcc 


accggagccc 


ggctgccggc 


cacgctcgtc 


ttcgaccacc 




50801


cgacgccgga 


cgcggtcgcc 


cggcacctcg 


cgtcgacgct 


gcccggtggg 




50851 


accgcggccg 


gtccggaccg 


ttccccgctg 


gccgaactcg 


accggatcgc 




50901 


cgccgagttg 


tcgccggagg 


gcgcggacga 


cgccacccga 


cagggcgtcg 




50951 


tcgggcggct 


gcggcacctg 


ctggcgcagt 


gggacggcac 


ccgacaggac 




51001 


ggcggtggga 


cgaccgtcga 


cgaccgcatc 


gaagcggcga 


gcgccgaaga 




51051 


ggtcctcgcc 


ttcatcgacc 


acgagctcgg 


ccggcaggcg 


gactcctgac 




51101 


ccgccccact 


cccgtcgctc 


gcgcgcacca 


catctgagga 


aggtttcacg 




51151 


gaccatgccg 


gacgaaaaga 


agctcgtcga 


ctatctgaag 


tgggtcacga 




51201 


aggacctcca 


ccagacccgc 


cagcgccttc 


aggaggtgga 


ggcggggcgc 




51251 


cacgaacccg 


tggcgatcgt 


cggcatggcc 


tgccgcttcc 


ccggcggtgt 




51301 


gcgctccccg 


gaggacctgt 


gggagctgct 


gtccgcgggc 


cgggacggca 




51351 


tcgggccgtt 


ccccgccgac 


cgcggctggg 


acctggcggc 


gctggccggc 




51401 


gacgggcccg 


gtcgcagcgc 


cacccaggaa 


ggcgggttcc 


tgcccgacgc 




51451 


ggccgccttc 


gacccgggct 


tcttcgacat 


ctccccgcgc 


gaggcgctcg 




51501 


ccatggaccc 


gcagcagcgg 


ctgctgctgg 


agaccgcctg 


ggaggccgtc 




51551 


gaacgctccg 


gcatcgaccc 


ggccgggctg 


cgcggcagcc 


gcaccggcgt 




51601 


tttcgtcggc 


accaacggcc 


aggactacgc 


gcacctggtc 


ctcgccgcgc 




51651 


aggacgacat 


gggcggctac 


gcgggcaacg 


gcctggccgc 


cagcgtgctc 




51701 


tccggccgac 


tggccttcgc 


gctcggcctg 


gaaggcccgg 


ccgtcaccct 




51751 


cgacaccgcc 


tgctcctcgt 


cactggtgac 


cctgcacctg 


gccgcacagg 




51801 


ccgtgcgcgc 


cggcgaatgc 


ggcctcgccc 


tggccggtgg 


cgtcacggtc 
51851


atgacgacct 


cgtcgagctt 


cgccggcttc 


agcctccagg 


gcggcctggc 




51901 


gccggacggc 


cgctgcaagg 


cgttcgccga 


ggcggccgac 


ggcaccggct 




51951 


ggtccgaggg 


catcggcctg 


cttctcgtcg 


agcggctctc 


cgacgcgcag 




52001 


cgcaacggcc 


acccggtgct 


cgccgtgctg 


cgcggctccg 


ccgtcaacca 




52051 


ggacggcgcg 


tccaacggcc 


tcagcgcgcc 


caacggtccg 


tcccagcagc 




52101 


gggtcatccg 


ccaggcgctg 


gccggcgccg 


gactcgtccc 


cggcgacgtg 




52151 


gacgcggtcg 


aggcgcacgg 


caccggcacc 


cggctcggcg 


accccatcga 




52201 


ggccggtgcg 


ctgctcgcca 


cctacggcca 


ggaccggccc 


gccgaccggc 




52251 


cgttgtggct 


cggctcggtg 


aagtccaacc 


tcggccacac 


ccaggccgcc 




52301 


gcgggcgtcg 


ccggcgtcat 


caagatggtg 


ctggccctgc 


ggcatggcgt 




52351 


cctcccgcag 


accctgcacg 


tggacgcgcc 


ctcctcgcac 


gtcgactggg 




52401 


agagcggcgc 


ggtgcggctg 


ctcaccgcac 


ccgtcgcctg 


gtccgagggc 




52451 


gacgaccggg 


tgcgccgggc 


cggcgtctcg 


tcgttcggca 


tcagcggcac 




52501 


caacgcccac 


gtcatcctcg 


aacaagcccc 


cgatcagccg 


gaaccgaccg 




52551 


cggaagagac 


ggctgccgcg 


gcgcccggcg 


gcaccgccga 


ggagcgggcc 




52601 


gccgctcccg 


tcgccccgcg 


cgccgtgccg 


tggccggtcg 


cggcacgcac 




52651 


cgccggcgcc 


ctcgacgccc 


aactggtccg 


ggtccgcgcg 


ctgaccaccg 




52701 


cgcccggccg 


caccgccgcc 


gacgtcggtc 


acgcgctggc 


caccgcccgt 




52751 


acccccttcg 


agcaccgggc 


gctgctggtc 


cacgagggcg 


gcgccgtcac 




52801 


cgaggtggcg 


cgcggcgccg 


tccccaccgg 


tgaccggggc 


gggctggccg 




52851 


tgctgttctc 


cggacagggc 


tcccaacggc 


cgggcatggg 


gcgcgaactc 




52901 


cacgcccgct 


acccggtctt 


cgccgccgcc 


ttcgacgaga 


ccgtcgccct 




52951 


gctcgacgcc 


cggctcggca 


cgtcgctgcg 


cgacatcgtc 


tgggaccagg 




53001 


accgcacccg 


gctcgacgac 


acccgccaca 


cccagcccgc 


gctgttcgcc 




53051 


gtcgaggtcg 


cgctgtaccg 


cctgctggcc 


tcctggggca 


tccggcccga 




53101 


ccacgtcacc 


ggacactcca 


tcggcgagat 


caccgcggcg 


cacgtcgccg 




53151 


gtgtgctgac 


cctcgcggac 


gcctgcaccc 


tggtggccgc 


ccgcgccacc 




53201 


gccatgagcg 


aactgccgcc 


cggcggcgcc 


atggtggcgc 


tggaggccac 




53251 


cgaggacgag 


gtgcgtccgc 


tgctcaccga 


cgacctcgcg 


atcgccgcgg 




53301 


tcaacgcccc 


ccggtccgtg 


gtcgtcgccg 


gcgccgagga 


cgccgccctc 




53351 


gccgtccgcc 


ggcacttcga 


cgacctgggc 


cgccggacca 


cccggctccc 




53401 


ggtcagccac 


gccttccact 


cgccgctcat 


ggacccgatg 


ctcgacgcct 




53451 


tccggacggc 


cctcgccccg 


ctgaccttcg 


ccgagccgga 


gatcccggtc 




53501


gtctccaacc 


tcaccggcct 


cccggccacc 


gccgaggaac 


tcgccacccc 




53551 


gcactactgg 


gtgtgccacg 


tccggcaggc 


cgtccgcttc 


ggcgacggcg 




53601 


tgcgcgccct 


cgccgaccgc 


ggcgtgcgga 


ccttcctcga 


actcggcccg 




53651 


gacggcgtgc 


tgtccgccct 


ggtccgggag 


aacctccccg 


agccgggcct 
53701


ggtcgccgtg 


cccgtgctgc 




53751 


tggccgccct 


gggaaccctg 




53801 


gcggtgttcg 


ccggcacccg 




53851 


gacgtacgcc 


ttccaacgcg 




53901 


acggcgaccc 


ggccgacctc 




53951 


ggcgccgccg 


tcaccctcgc 




54001 


cctcgcgctg 


ccctcccacc 




54051 


ggatcaccgt 


ccccggcgtc 




54101 


gacctgagcg 


gcaccccgca 




54151 


caccctcggc 


gacggcgaca 




54201


ccgaccccgc 


gggccaccgg 




54251 


accgaggacg 


ccccctggac 




54301 


cgcccccgaa 


gcgcccgcgg 




54351 


cgccgcggga 


cgcccgcccg 




54401 


accgccgcag 


gccgccacta 




54451 


ctggcggcgc 


gacggcgagg 




54501 


ccgccgccga 


ccgcgccttc 




54551 


ctccgcgcca 


ccgccgcact 




54601 


cgaaccgacc 


ggcatcaccg 




54651 


cactgcgggt 


ccggctcacc 




54701 


gccgcggacg 


ccacgggcgg 




54751 


cggctccccg 


caggaccgcc 




54801 


agggcggcct 


gttccacctc 




54851 


gccaccggca 


cccgctgggc 




54901 


ctacgccctg 


caccgcgccg 




54951 


tgggcggagc 


catcggcgac 




55001 


cccgtcgtcg 


gcggcccgga 




55051 


cgcccgcgcc 


ctggggctgc 




55101 


ccggcgcccg 


cctggtcttc 




55151 


gagaccgtca 


ccgacccggc 




55201 


cgcccagacc 


gagaacccgg 




55251 


cgttccggtc 


cgccgggatg 




55301 


cagctcgtcg 


tccgcgacca 




55351 


gccggagccg 


gccgccggca 




55401 


gcaccgtcct 


gatcaccggc 




55451 


cgccacctgg 


tcaccgtccg 




55501 


ccgcggcccc 


gaggcgccgg 



gcaaggagcg gcccgaggag accaccgtgc 
tgggcgcacg gcgcggacgt ggactgggac 
caccccgcag gccgaccccg tcgagctgcc 
cccgctactg gcccaccctc ggcgcccgcc 
gggcagaccg ccgccgccca cccgctgctg 
cgacgccgac gagaccgtgc tcaccggccg 
cctggctcgg cgaccaccgc agcgacggcc 
gccttcgccg aactcgccgt ccgcgccggc 
cctggcgcgg ctcgacctgc cggcgccgct 
ccgtcaccct ccaggtccgg gtcggcgccc 
ccgctgaccg tccacgcccg cctcgcagcc 
cacctgcgcg accggtctgc tcgccccgga 
atccgatcgg cccggccgac gccgggtggc 
gtgcccgtcg ccgacctcga cgcggccgcc 
cggcccccat ttccagggcc tgaccgggct 
tcttcgccga ggtggccctg cccaccgcca 
ggcatccacc ccgcgctgct ggccaccgcg 
ggacgacgac cacaccgccg gccacacccc 
gactcgccct gcacgccacc ggggccaccg 
gcgaccgggc ccgacaccgt ggccctcgcc 
cgcggtcctg accgccgaca ccgtcaccct 
cggctcccgc accggccggc cacaccgggc 
gactgggtgc cggtcgaccc cggcagccga 
cgtcgtcggc gacgacgaac tcgacctcgg 
acgagacggt cagtgcctac gcggcgtcgc 
agcggtctgg cgcccgacgt cttcctcgtc 
cgccgggccc gacgcggtgc acgccgtcac 
tccaggagtg gctgaacgag ccgcggttgg 
gtcacccgcg gcgccgtcgc ggtgcccggc 
cggcgccgcc gtctggggcc tgctgcgctc 
gcagtctgct gctggtcgac ctcgacgacg 
ctgccgcacg tcctcaccct cgacgaacag 
cgcggtccgc gccgcccgcc tggcccggct 
ccgcgccggc ccgcgcctgg gacccggacg 
ggcaccggcg gcctgggcgc cgcgctcgcc 
cggcgcccgc cacctgctgc tcgccggccg 
gcgccggcga actggtggcg gagctgaccg 
55551 cacagggcgc ggacgtgcgg gtggccgcct gcgacgtcgg cgaccgcacc
55601 gccctcgacg cgctcctcgc cacggtcccc gcggcgcacc cgctgaccgc
55651 cgtcgtgcac accgccggcg tcctggacga cgccctgatc ggctcgctca
55701 cccccgacca actggccacc gtgctacggc ccaaggccga cgccgcctgg
55751 catctgcacg acgccacccg cggcctcgac ctggccggct tcgtcctgta
55801 ctcctcggtc tccggcgtcc tgggcagccc cggccagggc aactacgccg
55851 ccgccaacgc ctacctcgac gcgctcgccc ggcaccgcgc cgaccagggc
55901 ctcccggcgc tctccctcgc ctggggcccc tggggtcggg gcagcggcat
55951 gaccgcgtcg gtcagcgacg ccgacctgga gcggatggcg cgcggcggcc
56001 tgccgccgct gaccgtcgag gacggcctgg ccctgttcga cgccgccgtc
56051 ggccgccccg agccggccct ggtgcccagc cgcatcaacg tcgccggcct
56101 gcgggaccag caggcactgc cggcactctg gcgcgacctg gtaccgcggg
56151 cccgccgcac cgcggccacc gccgaccgct ccccggtcac ggtgcgcgag
56201 cgcctccgcc acctcgacga gaccggccag gagcagctgc tcatcgacct
56251 cgtcgtcggc tacaccgccg gcctgctcgg ccaccccgac cccaccgccg
56301 tcgatcccga acggggcttc ctggagctgg gcttcgactc cctggtctcg
56351 gtcggcctgc gcaaccagct cgccgagatc ctcggcctgc gcctgccctc
56401 gtccatcgtc ttcgacagca agtcgccggt gaagctggcg cgttggctgc
56451 accaggaact cgccaacggc ccccagccgg gcgccaccgg ccccgccgcc
56501 gcggacgccc gtcccgcggt gcgctccgac gacaccctgg agggcctgtt
56551 ctacaacgcg gtgcgcggcg gcaagctcgt cgaggcgatg cggatgctca
56601 aggccgtcgc caacacccgg ccgatgttcg acacccccgc cgagctggag
56651 gagctctccg agccggtgac gctcgccgac ggcccgggcc ggccccggct
56701 gatcttcgtc agcgccccgg gcgccaccgg cggcgtccac cagtacgcgc
56751 gcatcgccgc gcacttccgc ggcagccgcc atgtctccgc gctgcccctg
56801 atgggcttcg cccccggcga gctcctcccg gccaccagcg aggccgcggc
56851 ccgtatcgtc gccgagagcg tcctgatggc cagcgagggc gaaccgttcg
56901 tcatggtcgg ccactccacc ggcggctcgc tggcctacct cgccgccggc
56951 gtcctggagg acacctggga cgtccggccc gaagcggtgg tcctcctcga
57001 caccgcgtcc atccgctaca accccggcga gggcaacgac ctggaccgca
57051 ccacgaggtt ctacctggcc gacatcgact cgccctcggt gacgctcaac
57101 agcgcccgga tgtccgccat ggcccactgg ttcatggcga tgaccgacat
57151 ccaggcgccc gcaccgaccg cccccaccct cctcgtgcgc gccgcccggg
57201 ccctcgacgg cttccggctc gacacctcgt ccgtccccgc cgacgaggtc
57251 cgggacatcg acgccgacca cctctccctc gccaaggagc actcggcact
57301 gaccgcgcag gccatcgagg gatggctcgc ggaactgccg gaccccgcgg
57351 cctgatctcc ggcccggccg gcccccgaca ggccgggccc atccccccac
57401


cggggcggcg 


ccacccaggt 


gcgccccggc 


gcacccggcc 


ccgcgcaggc 




57451 


agccggagcg 


cccatccccc 


cgcatggcac 


caccctccag 


aggagtcctt 




57501 


ccatgagcac 


accgaccgca 


ccgccctccc 


tgaaagcgga 


ggtgccgccc 




57551 


gtcctgcgcc 


tgagcccgct 


gctgcgcgaa 


ctccagtccc 


gcgcccccgt 




57601 


ctgcaaggtc 


cgcacccccg 


ccggcgacga 


gggctggctg 


gtgacccggc 




57651 


acaccgaact 


caagcagctg 


ctgcacgacg 


accggctggc 


ccgcgcccac 




57701 


gccgacccgg 


ccaacgcccc 


gcgttatgtg 


cacaacccgt 


tcctggacct 




57751 


gctcgtcgtc 


gacgacttcg 


acctggcccg 


cacgctgcac 


gccgagatgc 




57801 


gctccttgtt 


caccccgcag 


ttctcggccc 


gccgcgtcat 


ggacctgacg 




57851 


ccgagggtgg 


aagccctcgc 


cgagggggta 


ctggcccact 


tcgtcgccca 




57901 


gggaccgccc 


gccgacctgc 


acaacgactt 


ctcgctgccg 


ttctccctgt 




57951 


cggtgctgtg 


cgcgctcatc 


ggcgtcccgg 


ccgaggaaca 


ggggaagctg 




58001 


atcgccgccc 


tcaccaaact 


gggcgaactc 


gacgacccgg 


cacgcgtcca 




58051 


ggaaggccag 


gacgagctgt 


tcggcctgct 


gtccggcctg 


gcacgccgca 




58101 


agcgcatcac 


acccgaggac 


gacgtcatct 


cccggctctg 


cctgaaggtg 




58151 


ccctccgacg 


agcgcatcgg 


cccgatcgcc 


tccggtctgc 


tcttcgccgg 




58201 


cctggacagc 


gtcgccagcc 


acatcgacct 


gggcacggtg 


ctgttcatcc 




58251 


agcacccgga 


ccagctcgcc 


gcggccctgg 


ccgacgagaa 


gctgatgcgc 




58301 


ggcgccgtcg 


aggagatcct 


gcggtccgcc 


aaggccggcg 


gttcggtgct 




58351 


cccgcggtac 


gcgaccgccg 


atgtaccgat 


cggcgacgtg 


accatcaggg 




58401 


ccggcgacct 


ggtgctgctg 


gacttcaccc 


tggtgaactt 


cgaccgcacg 




58451 


gtcttcgacg 


agccggagct 


cttcgacatc 


cggcgcgccc 


ccaacccgca 




58501 


cctgacgttc 


ggccacggca 


tgtggcactg 


catcggcgcg 


ccgctggccc 




58551 


gggtcaatct 


gcgcaccgcc 


tacaccctgc 


tgttcacccg 


cctgcccggc 




58601 


ctgcggctgg 


tgcgcccggt 


cgaggaactg 


cgggtgctgt 


cggggcagtt 




58651 


gtcggccggc 


ctgacggagc 


tgcccgtcac 


ctggtgacgt 


gatgtcggac 




58701 


ccgcccggcc 


cggttccggg 


caggcggaag 


tgagggcccc 


cgccccgctc 




58751 


ggccgggggc 


cctcacgcac 


gggggagcgg 


gctcctcagt 


ccgccgcgac 




58801 


gctgatcgcc 


ccggacgggc 


agagcgccgc 


cgcgtcgcgt 


acgtcgccgg 




58851 


gatccgcggc 


gtccgccgcc 


ccggccagga 


cggtcaccag 


accgtcgtcg 




58901 


tcctggtcga 


acaggtccgg 


tgcggtcagg 


acgcactggc 


cggccccgac 




58951 


acagcggccg 


gggtccacgg 


tgatgcgcac 


gatcgttcct 


ccgagggtcg 




59001 


gttcgtcgcc 


ggggtcgggc 


cgccgccggg 


cggcccgacc 


gggctcactt 




59051 


ccaggtgacg 


ggcagttcgt 


gcaggccgaa 


cagcaccccg 


tcgtacttca 




59101 


acggcagttc 


ctccaccggc 


acggcgagcg 


tgagggaggg 


gatccgggcg 




59151 


aagagcgtgc 


ggtaggcgat 


gtccatctcc 


tcccgcacca 


ggttctggcc 




59201 


caggcactgg 


tggacaccgt 


agccgaacgc 


cacgtgcgag 


cgcgccgagc 
59251


gctcggggtc 


gaactccgag 




59301 


gcggcggcga 


tcagcggcac 




59351 


gccgatctcc 


acgtcctcga 




59401 


ccgagtagta 


gcgcagcagt 




59451 


cgcgggttgg 


tgagcaactg 




59501 


cgtctcgtgc 


ccggcgatca 




59551 


gccgggacag 


tgtgcccgcg 




59601 


ggtcgccgct 


gcttgatctg 




59651 


ctgggtcgcc 


ttgtcgcgct 




59701 


cccgggtccg 


gtcctcgaag 




59751 


agactggaga 


tcaccaggga 




59801 


gtccgccgag 


gtgcccgcgg 




59851 


tctgctggat 


caccggccgc 




59901 


gggatcaagg 


tcttgcggaa 




59951 


gaaccacccc 


ggcacctcgt 




60001 


ccttggggaa 


accgggtttg 




60051 


agcaccgccc 


ggacgtcctc 




60101 


ggggagttcg 


gagcggacca 




60151 


ccggcggggg 


gaagggccgg 




60201 


cactgcggcg 


cggcggtccg 




60251 


tagaacgcgc 


ggatccggcc 




60301 


tccggtgtgc 


gtcggcagat 




60351 


tcaacgcggg 


ccagtcggcg 




60401 


ttgaagaaga 


cccgggtctc 




60451 


ctcgtcccgc 


cgctcggccc 




60501 


gcggcatcag 


cgtgatgccg 




60551 


ttctcgatgt 


cgcggcgcag 




60601 


cgcgagcgcc 


accgcggcct 




60651 


gcttcttgtg 


caggaagctg 




60701 


tgggccatct 


gctcggccag 




60751 


ctcgccggcc 


gagatgatct 




60801 


cgccccgcgg 


ccgcaccccg 




60851 


aggttgtact 


cgtacgccag 




60901 


gccgtagatg 


tgcaccggca 




60951 


cctcgatgcg 


ggacacgtcg 




61001 


accggcgtgg 


caccggtgta 




61051 


gaactccggg 


acgatcacct 
ggcgccgcga aggccgtggc gtcgtggttg
gatgccctcg ccggcccgga tcagctgacc 
cggccacccg gaatgccacc agatcggcca 
tcctcgacga tccggtcgtc gccgatccac 
caccacgccc aggccgatgt tgttcgccgt 
gcagcagcat cgccaccccg gacagctcct 
gcgatcagcc ggctgatcag gtcgtcaccg 
gatgagtcgg cccagatagc gcagcagcgc 
cctcgtcggt cgaactgagc cggaccagca 
aagtcccggt cgaccttggg caccccgagc 
cggcaccggc agcgcgaagg actcgacgag 
ccagcatcgc gtcgatccgc tcgtccacga 
agttcccgca ccttgcgcac ggtgaactcg 
ccggccgtgc tccggtgggt ccatcgccac 
actgcgacgg ggccccaccg gtccgaccgg 
gaggggtcgg cgctgatccg ggggtcggtc 
gtgccgggtc accaaccaga ccgggccgct 
ggcccgcccc gccgcggtag gtcgcgtact 
cccggcctgc gcagcgggaa ggccaccggg 
cgcgtcggct tcggtgctca tgccactccg 
ggtgatgaac tcctgctcct gtgcggtcag 
agaagccgtc ggcgctgagc cgggccgcgt 
gagaagtaac cgggctgacg gctcatcggc 
gatgccctcg ccggcgaggt acgcgcacag 
gcaggtcgta catccacagc acgtcgcgcg 
gggatgtcgc gcagcgcctc gtcgtagcgc 
ggcgaggatg gtgtccagct gctcggtctg 
gcatgttggt catccggaag ttgtaggcca 
tggtccttgg tgaacgccat cgcccgcaga 
gtgcgggtcg tgggtcaggc agacgccgcc 
tgttggcgaa gagcgagaaa caggcgatgt 
tgcgcctcgg cggagtcctc caccacccgc 
gttcagcacg gcgtccatgt cgcactgccg 
tgatcacttt ggtgcgcggg gtgatcttct 
atgttcaggt cgtcgccgca gtccacgaac 
ggtcaccgcc caggcggacg cgatcatcgt 
cgtcaccggg gccgacgccc agcgcgcgca 






61101 gcgccagcgt cagcgccgtg gtgccggagg agcaggcgac gccgaacggc
61151 acgtcgttgt acgcggcgaa cgcctcctcg aaccgcctga cgtacggccc
61201 ctgcgaagag atccagccgc cgccgacggc ctccgtcaca tagtcgagct
61251 cgcggccctg gagccacggc atggacaccg gatacgtaaa ggacatgggt
61301 tttgagtctt cctcggtcag tcggttgcca gggcgggaag gccgagcagc
61351 aggtccgccg ccgcggcccg gccgccggcg tcccgcagca gcccggcgaa
61401 gtgctccgcg cgctcggtga aggagggttg gtcgagcacg cgggtgatct
61451 tgtccaggac gtcctcggtg tccacggtct ccggccggtc cagggtcagg
61501 ctcaccccga agtcctggcc ccggatcgcc tggtcgtcgc agtccaccca
61551 caacggccgg accaccagcg gctttccgaa gtacaggccc tcgtggtagc
61601 cgttgccacc ggcatgggtg aagaacgcct tcacgttcgg atgggccagc
61651 acgtccagct gcgacggcac ccagccctcg atccgcaggt tgtccggcag
61701 ctcggcggcc ggcggcagca actcctgttg gccgcgcggg agtttccaca
61751 acacctggtg gccccggccg tccagtcgcc gggcgacctc caccagcgac
61801 gccacctgct cacgggtcag ccgggtgatc gtgccgaagc ccatgtacac
61851 cacggacttc tgcgccgaca gccagtccga caggccgtcg tcgtccggtg
61901 cctggggcag cggcggcacc atcgtgccca ccagccgcag cttcggatgc
61951 atcgggaacg ggtagtccaa ctcccttacg gagtagcaca agacctgctc
62001 cgcatggtcg atccgcgcca tcatctgccg cgcctggggc gcgatgccca
62051 gctcggtgcg gacccggttg tcctcctcga cgaccttgcg gacgtccgac
62101 gtcaggaaca tcccgagcgt ccgcagccgg aacagctggt tctcgatccg
62151 ctgagccagg gacatcgcgg ccggcagccc cgagtgcggc accgggaaac
62201 ccgacggggt gtaggacttg gcgaacggga cgtgcgaggt gaggacgttg
62251 ctcggcacga acggcacccc gagcacgaac ggaatgccct tggtgatcgc
62301 cagctcgtac ccgaactggc acatgctctc gatcaccatc agcgccggct
62351 cgacctcctc gacgatctcc tccaggcggc ggtacttcgc catccgcgac
62401 tccggcgcga acgaatgccg aatcaccgcc gcgtgcgcct tgaaccgcga
62451 ccgctgcgtc acctccgcat acgtcgcgtc gtcccatgtg accgccgaca
62501 tctgcgagac ggtgtcgccg agcgacgcga accgaaccgg gctgccgtcc
62551 accacggccg ccacctcgtc gcgcgctttc tcgtcggtgg cgaaccacag
62601 gtccgccacg tcgcgccggg acaattcccc ggccagcacg agcagcggat
62651 tgagcaggcc gctttcggca taactgacga acaggatcgg ccgccgattc
62701 gcgcccatgg acaacacccc tcggaatgtg gcgggccgcc gggcccgcgc
62751 gccacgcacc cgcccggccc ggtcgccggg tgagtgcatt cgccgacgcc
62801 gccacccgag gcgcgtgttg ccggaaggaa gggtcaccgg ccggcacccg
62851 gaacgcgccg cgtggaaaac gggtcggtta cttggtctca tgccacggac
62901 cggggaatca ctagtcttcg gcgcgcgacg gccctttccg ggccgtgtgg
62951


ccaatgcccg 


tccccggcgc 


ccgtcattcc 


ttagggaaaa 


gtacagcgtt 




63001 


tgcgaacgta 


cgatccggca 


cgcagaggtg 


acctgaggcc 


aacttttccg 




63051 


caggggtgag 


caaggcatga 


cgatcggagc 


cgacgaggac 


ccggtggtgg 




63101 


tcgtcggaat 


ggcctgccgt 


tatccgggtg 


gggtcgccgg 


cccggaggac 




63151 


ctgtgggaac 


tggtccgcac 


cggccgcgac 


gcgaccaccg 


ccttcccgga 




63201 


cgaccgcggc 


tgggacctgg 


ccgcactggc 


cggcgacgga 


cccggccgca 




63251 


gcgcgacccg 


cgagggcgga 


ttcctcaccg 


gcgccgccga 


cttcgacgcc 




63301 


gccttcttcg 


gcatgtcgcc 


ccgcgaggcc 


gtctccaccg 


acccgcaaca 




63351 


gcgcctcgtc 


ctggagaccg 


cctgggaagc 


cctggagcgc 


gccggcatcg 




63401 


acccgcactc 


cctgcgcggc 


agccgcaccg 


gggtcttcgt 


cggcgccagc 




63451 


ggccaggact 


acgccgccgt 


cacccacgcc 


tcgcccgacg 


acctggacgg 




63501 


acacgccctc 


accggcctgg 


cccccggcgt 


cgcctccggt 


cgcctggcgt 




63551 


acgtcctggg 


cctcgaaggc 


cccgccgtca 


ccgtcgacac 


cacgtcctcc 




63601 


tcgtcgctgg 


tcgcgctgca 


ctgggcggtc 


cgcgccctgc 


gcgcggggga 




63651 


gtgcagcacc 


gccctggccg 


gcggcgtcac 


ggtgatgtcc 


accccggccg 




63701 


ccttcgtcgg 


ccacacccga 


cagggcggcc 


tcgcgcccga 


cggccgctgc 




63751 


aagccgttct 


ccgacgacgc 


cgacggcacc 


gcctgggcgg 


agggcgtcgg 




63801 


catcgtcgtc 


ctggagcacc 


tgtccaccgc 


ccgcgccgcc 


ggcaaccccg 




63851 


tcctcgccgt 


gctgcgcggc 


tcggccgtca 


accaggacgg 


cgcctccgac 




63901 


ggcctcaccg 


cacccagcgg 


tcccgcccag 


gaacgcgtca 


tccgcgccgc 




63951 


cctcgccgac 


gcccgactcg 


cccccgccga 


catcgatctc 


gtcgaggcgc 




64001 


acggcaccgg 


cacccggctc 


ggcgaccccg 


tcgaggcccg 


ggcgctgctc 




64051 


gccgcctacg 


gccaggaccg 


ggacccggac 


cgaccgctgc 


gcctcggttc 




64101 


cctgaagtcc 


accctcggcc 


acgcacaggc 


cgccgccggc 


atcggcggag 




64151 


tgatcaagac 


cgtcctgacc 


ctgcggcacg 


gcctgatgcc 


gcgcatccgg 




64201 


cacctggcca 


cccccacccg 


ccaagtcgac 


tggtcccagg 


gcgccgtggc 




64251 


ccccctcacc 


gaccacacgc 


cctggccacc 


ggccgaccga 


ccgcgccgcg 




64301 


ccggcgtctc 


ctccttcggc 


atcagcggca 


ccaacgccca 


tgtgatcctc 




64351 


gaagaggcgc 


cgcccgccga 


cgtccccgtc 


acccggcccg 


gcaccctccg 




64401 


ccccagcacc 


gtcccctggc 


cggtctccgc 


cgccacgccc 


gaagccctcg 




64451 


acgcccaact 


cgcccggctc 


cgcgcccacc 


tgcgcaccca 


ctcggacctg 




64501 


gacccgctgg 


acgtcggcta 


ctccctggcc 


accggccgcg 


ccgcgctccg 




64551 


ccaccgggcg 


gtcctcctgc 


cgcccgccga 


cggcaccgcc 


gcggacgccg 




64601 


tcgagcacgc 


ccgcggtgcg 


gcccaccagc 


gccgcaccgc 


cgtcctcttc 




64651 


tccggccagg 


gcagccagcg 


cccgggcatg 


ggccgcgaac 


tcgccgcccg 




64701 


cttcccggtg 


ttcgccgacg 


cactggacga 


cgcgctgcgc 


gccctggacc 




64751 


ggcacctgga 


cggcccggtg 


cgcgaggtga 


tgtggggcac 


cgacgccgcg 
64801


ctcctggacc 


ggaccggctg 




64851 


cgccctccac 


cgcctggtcg 




64901 


gcggccactc 


cgtcggcgag 




64951 


agcctggagg 


acgcctgccg 




65001 


ggcgctcccg 


gccggcggcg 




65051 


aagtggcccc 


gctgctcggc 




65101 


cccaccgcgg 


tcgtcgtcgc 




65151 


cgcccgcttc 


gccgaccgcg 




65201 


acgccttcca 


ctcgccgctg 




65251 


gtcgtgagcc 


gactgacctt 




65301 


cctcaccggt 


gaactcgccg 




65351 


tccggcacgt 


ccgcgacacc 




65401 


gccaaggccg 


gcgccgacgt 




65451 


gtccgcgatg 


gcccgcgaca 




65501 


tccccgccct 


gagcaaggga 




65551 


ctcggccgcc 


tgcacaccct 




65601 


cgccggcacc 


ggcgcccgcc 




65651 


acgtgcgcca 


ctggcccacc 




65701 


gccctcggcc 


accccctgct 




65751 


cggcaccgtc 


tgctccggcg 




65801 


ccgaccacac 


cgtcgccggg 




65851 


gaactcgccg 


tgcgcgccgg 




65901 


actccacctc 


accaccccgc 




65951 


tccaggtgca 


cgtcggcccc 




66001 


gtccacaccc 


gccccgacca 




66051 


caccggcacc 


ctcggcagca 




66101 


gcggcacccc 


ggccgcctgg 




66151 


gccgaccact 


acgaacggct 




66201 


cttccgcggc 


ctgcgggccg 




66251 


acgtggaatg 


cccgcccggc 




66301 


caccccgccc 


tgctcgacgc 




66351 


caccgtgccc 


gtcgcctggc 




66401 


ccgcgctgcg 


ggtccgcatc 




66451 


accgcggtcg 


acgtgcacgg 




66501 


cgcccgcccg 


ctgaccgacg 




66551 


aggcccgcgg 


cgagacgccc 




66601 


gcccgccccg 


gcccggccgg 



gacccagccc gccctgttcg ccgtcgaggt 
cgtccctcgg cgtcaccccc gacttcgtcg 
atcgccgccg cccacgtcgc cggcgtcctg 
cctggtggcc gcccgcgcca cgctgatgca 
cgatggccgc gctggaggcc accgaggacg 
gcacacctcg cgctggccgc cgtcaacggc 
cggagccgag gacgccgtgc ggcaactgac 
gccggcgcac cagccggctg gccgtctcgc 
atggagccca tgctcgacgc cttccgggac 
ccaccagccg tcgatcccgc tggtctccaa 
gcagtgagat caccagcgcc gagtactggg 
gtccgcttcg ccgacggcat caccgcactg 
cctgatcgaa ctcggccccg gcggcgtgct 
ccctcggccc cgacagcacc accgacgtcg 
cggcccgagg agaccgcctt cgccggcgcc 
cggcgtcccc gtcgactggc ccgccttcta 
gcgtcgaact gcccacctac gccttccagc 
ccgccccgcc cgaacggcgc cgggcccggc 
cggctccgcc gtcgaactcg ccgacggcgg 
ccctctccct ccgcacccac ccctggctcg 
cgggtcgtgc tgccggccac cgcgctgctg 
cgacgaggcg ggctgcgacg tcctgcacga 
cggccctgcc cgacgacgcc gccctgcacg 
gccgacacca ccgggcgccg cgccgtcacc 
ccacccggcc ggcgactgga cccgatgcgc 
ccccgccgtc cgcagccgaa gccgccacgg 
ccgccggccg acgccgaacc cctcgacctc 
cgccgaccgc ggcttcgact acggcccgac 
cctggcgacg cggcgcggag atcttcgcgg 
accgccgacg acgcccccga ccacggactg 
ggcccggcac gccgccatgg cggtggacgg 
acggcgtccg gctgcacgcc gtcggcgcca 
cgccccacca cgaccggcac gctgaccctc 
cgcgccggtc gtcaccgtcg aggccctcac 
aggaacgcgc cgccccgcgg acgccgcggc 
gccgacgccc gcccggcccg gcccgcggcg 
cgaacccctc ccggacacca ccgggtccca 
66651 ccccaccgcc ggccacctcg ccgcgctgcc gccggccgcc cgggagcgcc
66701 agctgctgga cctggtgcgc acccaggccg ccgccgtcct gggccacccc
66751 ggccccgagg ccgtcggcac ccgcagcgtc ttcaaggagc tgggcttcga
66801 ctcgttggcc ggcgtcgaac tcgccgaccg gctcaccgcc cgcaccggac
66851 tgcgcctgcc ggccaccctc gtcttcaact tccccacccc cgaacgtgct
66901 gcccaccgcc tcggggaact cctcgccgca accgcccccc tcgaccccgg
66951 ggcgtacgga gaggaactca ccaggttcga ggcgatcgtg acgaacctgc
67001 cgcaggacgg ccccgaacgc cgggccgtcg cggaccggtt ggacgccatc
67051 gtctccgcac tccgccagaa ctcgcctgca gaggtgccct cctcggacga
67101 ggacatcgac acggtgtcgg tcgacagact gctcgacatc atcgatgaag
67151 agttcgaaac cacatagaga aattgttgct ttcgttcgcg acccgatgac
67201 gaggacggac cgatgcagga accccagcaa ggccagccgg accagcagga
67251 gaaaatcgtc gactatctcc ggcgggtcac ttcagatctt cgccgtgccc
67301 gccgccgcat tggcgaactg gaatccaagg acaacgagcc catcgccatc
67351 gtcggaatgg gctgccgact tcccggcggc gtcaattcgc cggaatccct
67401 gtgggacctg gtgcgttccg gcggcgacgc catttccgga ttccccgtcg
67451 accgcggctg ggacctggaa accctcaccg gaaacggcga cggcagcagc
67501 gccacccacg aaggcggatt cctctacgac gccgcggaat tcgacgccgc
67551 cttcttcggc atctcgccgc gcgaggcgac tgctatggac ccccagcagc
67601 gcctcctcct cgaagtcgcc tgggaggcgc tggagcgcgc cggcatcgcc
67651 cccacagccc tgcgcggcag ccggtccggc gtgttcgtcg gctcctacca
67701 ctggggcgcg ccctcggccg acgccgccac cgaactgcac ggccacgccc
67751 tgaccggcac cgccgccagc gtgctgtccg gccgcctggc ctacaccctc
67801 ggcctcgaag gcccggccgt caccgtcgac accgcctgct cctcctccct
67851 cgtcgccctg cacctggcgg cccagtccct gcgcgtcggc gaatcctcgc
67901 tcgccgtgat cggcggcgtc acgatcctca ccgagccgtc cgtcttcgtc
67951 gagttcagcg cccagggcgg cctggcaccg gacggccgct gcaaggcgtt
68001 ctccgacgcc gccgacggca ccggttgggc cgagggcgtc ggtgtcctcg
68051 tcgccgagcg gctctccgac gcgcagcgca acggccatcc ggtcctcgcc
68101 gtgctgcgcg gctccgccgt caaccaggac ggcgcctcca acggcctgac
68151 cgcccccaac ggcccctccc aggaacgggt catccagcag gccctcgccc
68201 ggaccggcct gacccccgcc gacatcgacg ccgtcgaggc gcacggcacc
68251 ggcacccggc tcggcgaccc catcgaggcc caggccctgc tcgccaccta
68301 cggccaggga cacacccccg accagccgct gtggctcggc tccctgaagt
68351 ccaacatcgg gcacacccag gcggccgccg gcgtcgccgg tgtcatcaag
68401 atggtcatgg cgctgcgcca cggccacctg ccgccgaccc tgcacgccga
68451 cgcgccctcc tcgcacgtgg actggtccgc cggatcggta cgcctgctga
68501


ccgagggcca 


gcagtggccg 


gagaccggac 


gtccgcgccg 


ggccgcggtg 




68551 


tcctcgttcg 


gcatcagcgg 


caccaacgcg 


cacgccctgc 


tggaacaggc 




68601 


accccacccc 


gcggacaccg 


cggacgccgg 


cgacgacgcc 


gcgcccaccg 




68651 


aaccggccgg 


cgcgcccgcc 


gcgctgccct 


ggatcgtctc 


cggacactcc 




68701 


ccgcaggcgc 


tgcgcgacca 


ggccgccgcc 


ctggccgcca 


gggtcgagac 




68751 


cgaccccgcg 


ctccgccccc 


aggacatcgg 


gcacaccctg 


cacaccgccc 




68801 


gcgccctgct 


cgaacgacgc 


gccgtcgtcg 


tcgcccccga 


ccgcgccgaa 




68851 


ctcctcgcgg 


ctacccacga 


gttggccgcc 


ggccggtccg 


cgaacgccgt 




68901 


cgtcgagggc 


ctcgcggacg 


tcgagggtcg 


gacggtgttc 


gtgttccccg 




68951 


gtcagggttc 


gcagtgggtg 


gggatggggg 


cccaactcct 


cgatgagtcg 




69001 


gcggtgttcg 


cggagcggat 


tgccgagtgt 


gcggcggcac 


tcgccgagtt 




69051 


caccgactgg 


tcgctggtcg 


atgtgctgcg 


gggtgtggtg 


ggtgcgccgt 




69101 


cgttggagcg 


ggtcgatgtg 


gtgcagccgg 


cgtcgttcgc 


ggtgatggtg 




69151 


tcgttggctg 


cgttgtgggg 


ttcccgtggt 


gtgttgccgg 


atgcggtggt 




69201 


ggggcattcg 


cagggtgaga 


tcgctgccgc 


ggtggtgtcg 


ggtgcgctgt 




69251 


cgttgcggga 


cggggcgcgg 


gtggtggcgc 


tgcggagtca 


ggccattggt 




69301 


cgtgcgttgg 


cggggcgggg 


cgggatgatg 


tccgtcgcgc 


tgtcggtgga 




69351 


cgtgctcgaa 


ccgcggttgg 


tcgagttcga 


ggggcgggtg 


tcggtggccg 




69401 


ccgtcaacgg 


cccgcgctcc 


gtcgtggtcg 


ccggcgagcc 


cgaggcgctg 




69451 


gacgcgctgc 


acgcccggct 


gaccgccgac 


gacatccggg 


cccgccggat 




69501 


cgcggtggac 


tacgcctcgc 


actcgcacca 


ggtcgaggac 


ctgcacgagg 




69551 


aactgctgga 


ggtgctggcg 


gagctggcgc 


cgcgcacgtc 


ggaggtgccg 




69601 


ttcttctcga 


ccgtgaccgg 


cgactggctg 


gacaccgcgc 


ggatggacgc 




69651 


cggctactgg 


ttccgcaacc 


tgcgcggacg 


ggtgcggttc 


gcggacgcgg 




69701 


tggcggacct 


gctggcggcg 


gagtaccgcg 


cgttcgtcga 


ggtcagctcg 




69751 


cacccggtgc 


tgacgatggc 


ggtcttggac 


ctgatcgagg 


aggccggggt 




69801 


cacggccgtc 


gcgaccggca 


ccctgcgccg 


tgaccagggt 


ggcgcgggcc 




69851 


gcttcctgct 


gtcggccgcc 


gaggtcttcg 


tgcgcggtgt 


ggacgtggac 




69901 


tgggcggggg 


cgttcgaggg 


gaccggtgcg 


gcccgggtcg 


acctgcccac 




69951 


ctacgccttc 


cagcgcgagc 


ggtactggaa 


cacccgcacc 


gccgccgacc 




70001 


gcaccccggc 


cgacgccccg 


atggacgccg 


aattctgggc 


cgccgtcgaa 




70051 


caggcggacg 


tctccgcgct 


gaccgccgcg 


ctcggcaccg 


acgaggactc 




70101 


cgtcgccgcc 


atcctgcccg 


gcctcacctc 


ctggcgccgg 


gcccgctccc 




70151 


agcgcaccac 


cctcgactcc 


tggcgctacc 


gcgtcacctg 


gacgcccctc 




70201 


gcccaggtgc 


cccgcgccac 


cctgaccggc 


acctggctgc 


tggtcaccac 




70251 


cgacggcatc 


gacgacaccg 


atgtggcagg 


ggcgttggag 


agctacggcg 




70301 


ccgaggtgcg 


ccggctggtc 


ctggacgagg 


agtgcaccga 


ccgcgccgtc 
70351


ctgcgggagc 


ggctggccgg 


cgcggaggac 


gtgaccggca 


tcgtctccgt 




70401 


cctcgccgcc 


gccgaggacg 


acgccgcacg 


ccaccccggc 


ctcacccggg 




70451 


gactcgcgct 


caccgtctcc 


ctcgtccagg 


ccctgggcga 


cgccgaggcg 




70501 


accgcgccgc 


tgtggttcct 


gacccgcggc 


gccttcgcca 


ccggcccgtc 


5 


70551 


cgaccccgtc 


acccggcccc 


tgcagagcca 


gatcgcgggc 


gtcggctgga 




70601 


ccaccgcgct 


ggagcacccg 


cagcgctggg 


gcggcaccgt 


ggacctgccc 




70651 


gacaccctcg 


acgcccgggc 


cgcccagcgg 


ctcgccgccg 


cgctgtccgg 




70701 


cgccctcggc 


gccgaggacc 


agctcgccgt 


ccgcgccgcc 


ggggtactgg 




70751 


cccgccgcat 


cgtgcgtgcc 


ggacaccgcg 


ccggacgacc 


ggcacggacc 


10 


70801 


tgggcgccgc 


gcggcaccac 


cctgatcacc 


ggcggctccg 


gcaccctcgc 




70851 


cccgcagctc 


gcccgctggc 


tggccgaacg 


cggcgccgag 


cacgtggtgc 




70901 


tggtcagccg 


gcgcggtgcc 


gacgcccccg 


gagcgcccga 


actcatcgcg 




70951 


gaggcagccg 


agtcgggcac 


cgaggtgacc 


gtcgccgcct 


gcgacatcac 




71001 


cgaccgcgac 


gcggtcgccg 


cgctgctggc 


cgacctcacg 


gccgacggcc 


15 


71051 


gcaccctgcg 


caccgtcatc 


cacgccgccg 


ccgccatcga 


gctgtccgcg 




71101 


ctcgccgaca 


ccaccgtggc 


ggagttcgcc 


gacgtcgtgc 


acgccaaggt 




71151 


caccggcgca 


cggatcctcg 


acgaactgct 


cgacgacgcg 


gaactggacg 




71201 


acttcgtcct 


gtactcctcc 


accgccggca 


tgtggggcag 


cggcgtgcac 




71251 


gccgcctacg 


tcgccggcaa 


cgcctatctg 


tccgcgctcg 


ccgagcagcg 


20 


71301 


ccgcgcccgc 


ggactgcgca 


ccacctccat 


ccactggggc 


aagtggcccg 




71351 


acgaccgggc 


acgcgagctg 


gccgacccgc 


accggatccg 


ccgcagcggt 




71401 


ctggagtacc 


tcgaccccga 


gctggcgctc 


accgcgctcc 


agcacgtcct 




71451 


ggacgacgac 


gagaccgtca 


tcggcctcat 


ggacatcgac 


tgggacacct 




71501 


accacgacgt 


gttcaccgcg 


ggccggcccg 


cgcacctctt 


cgaccagatc 


25 


71551 


cccgaggtgc 


ggcgccgcct 


cgaccaggca 


tccgtcccgg 


accccgcggg 




71601 


cccggccgcc 


gacggcctgg 


ccgcccgcct 


gcacggcctc 


gccgccgccg 




71651 


aacaggaccg 


gctgctgctc 


accctggtcc 


gcaccgaggc 


cgccgccgtc 




71701 


ctcggccacg 


cctcggccga 


gtccttcccc 


gagcgccgcg 


ccttccgtga 




71751 


cctcggcttc 


gactcggtca 


ccgccgtgga 


cctgcgcaac 


cggctcgtgg 


30 


71801 


ccggcaccgg 


actgcggctg 


ccctcgacga 


tggtcttcga 


ccaccccaac 




71851 


tgcgcggcgc 


tcgccgcgtt 


cctgaagacg 


acggcgctcg 


gcgtccccgg 




71901 


cgccgcaccg 


cagcagcacg 


ccgctaccgg 


caccccggcc 


gacgacgacc 




71951 


cgatcgccgt 


gatcggcatg 


agctgccgct 


accccggcgg 


cgccgccacc 




72001 


cccgaggaac 


tgctgcggct 


cgccctcgac 


ggcgccgacg 


tcatctcgga 


35 


72051 


gttccccgcg 


gaccgcggct 


gggacgcccg 


gggcctgtac 


gacccggacc 




72101 


ccgaccgccc 


cggccacacc 


tactccgtcc 


agggcggctt 


cctccacgag 




72151 


gccgccggct 


tcgatcccgg 


cttcttcggg 


atctccccgc 


gcgaggcggt 
72201 cgccatggac ccgcagcagc ggctcctgct ggagacctcc tgggaggcgt
72251 tcgaacgcgc cggtatcgac cccgcgtcac tgcgcggcag cgccgccggc
72301 accttcttcg gcgccagcta ccaggactac tcctccaccg tgcagaacgg
72351 cacgggggag tccgaggcgc acatggtgac cggcaccgcg gccagtgtcc
72401 tgtccggccg ggtctcctac ctgctcggcc tggagggccc cgcggtcacc
72451 gtggacaccg cctgctcctc ctcactggtc gccctgcacc tggcctgcca
72501 gtccctgcgc gacggcgaga gctccctcgc gctggccggc ggtgcggccg
72551 tgatggccac cccgcacgcg ttcgtcggct tcagccggca gcgtgccctg
72601 gccaaggacg gccgctgcaa gccgttctcc gacaccgccg acggcatgac
72651 gctcgccgag ggcgtcggcg tcgtcctgct ggagcgcctg tcccacgccc
72701 gcgccaacgg gcaccgggta ctggccgtga tccgcggttc cgccgtcaac
72751 caggacggcg cctccaacgg cctgaccgcg cccaacggcc cgtcccagca
72801 gcgcgtcatc cgccaggcgc tcgccaacgc cggcctgacc ggcgccgacg
72851 tcgacgcggt cgaggcgcac ggcaccggca ccaagctggg cgaccccatc
72901 gaggcccagg ccctgctcgc cacctacggc caggaccgcg acgccgaacg
72951 gccgctgctg ctgggctcgg tgaagtccaa catcggccac acccaggccg
73001 cggccggcgt cgccggcgtc atcaagatgg tgctggccat ggacgccggc
73051 gaactgcccg gcaccctgca cctcgacgcg ccctccagcc acgtcgactg
73101 gaccgccggc gccgtggaac tgctgcgcgg gcgcaccccg tggcccgaga
73151 gcgggcgccc ccgccgggcc ggtgtctcct cgttcggcat cagcggcacc
73201 aacgcccacc tgatcctcga acaggccccg gccaccgagc cgccagccga
73251 ccccgaccgc ctccgggaca ccgccaccga caccgtcgtc ccctggccgc
73301 tcgccgccaa gtccccggcc gccctgcgcg cccaggccgc ccggctcctc
73351 gccaccgtcg agcacgaccc cgacctcccg cccgcccccg tgggccacgc
73401 cctggccacc acccgcgccg ccctcgaaca ccgcgccgtc gtcgtcggcg
73451 agcgccgcga ggacttcctg cgcggcctgg ccgccctgtc caccggcgcc
73501 tcgacggccg gcctggtcag cggcatcgcc ggccccgacc ccgagggagc
73551 ggtcttcgtc ttccccggcc agggatccca gtggtgggga atgggccgcg
73601 aactcctcgc cacgtccgag gtgttccgca ccgcgatcga tgactgcgcg
73651 acggccctcg ccccgtacgt cgactggtcg ctgcacgacg tcctggccgg
73701 cgagggcgac cccgccctgc tggagcgggt ggacgtggtc cagcccgcgc
73751 tgttcgccat gatggtcggg ctgtccgcgc tctggcgctc ccacggcgtc
73801 gtcccggcgg ccgtggtcgg ccactcgcag ggcgagatcg ccgcggcctg
73851 cgtcgccgga gccctcagcc tggccgacgc cgcccgcgtg gtggcgctgc
73901 gcagccaggc actgccgcaa ctgtccggac gcggcggcat gatgtcggtc
73951 tccgcccccg tagagcgggt caccgcactc ctcgccccgt ggcaggaggc
74001 gctgtccgtc gccgcggtca acggcccctc gtccgtggtc gtctccggcg
74051 acaccgacgc gctcgacgcc ctgcacaccg cctgccagga acagggcgtg

74101 cgggcccgca aggtgtccgt ggactacgcc tcgcacgggc ggcacgtcga 

74151 ggccgtccgc gacgaactcg cccgcgtcct cgcgccggtc gacccgcgcg 

74201 cccccgaggt gccgttctac tcgacggtca ccggcgaccg cgtggacgac 

74251 gccgccttcg acggcgccta ctggtacacc aacctccgcc agaccgtccg 

74301 catggaggag gccacccgcg ccctcctcgc cgccggacac cgcgtcttca 

74351 tcgaggtcag cccgcacccg gtgctcgccg ccccgatcca ggagacgcag 

74401 gaggccgtag cggaggccac cggcgggtcc gcggtggtcc tcggctcgct 

74451 ccgccgcgac gagggcggcc cgcggcgctt cctgacgtcg ctcgccgagg 

74501 cccacaccca cggcgccccg gtcgactgga ccaccacctt cgcccggtcc 

74551 gcctaccagc cggtggacct gccgacctac cccttccaac gacaggactt 

74601 ctggcccgag gcccggcccg ccaccccggc cgccggcgcc gacgcgtccg

74651 acgccgcgtt ctggcaactg gtcgagaacc aggacctcgc cgcgctcgcc 

74701 gacgcgctcg gcgtccccgc cgacgacgag cacaccgcgc tcggcaccgt 

74751 gctgccggcc ctgtccgcct ggcgcgccaa ggcccaggcc cgcacccgga 

74801 tcgacgaact ccgctaccac gtccagtgga cccgggtcgc cgagcccgcg 

74851 gcggccccca ccaccggccg gctgctggtc gccgtcccgc cggaccacgc 

74901 cgacgccccc tgggtcgccg cggcgctcga cgccctgggc accgacaccg 

74951 tccgcttcga ggccaagggc accgaccgcg cgggatgggc cgcacagatc 

75001 gcccaactcg tcgaggacgg cgaggagttc accggcgtgg tgtcgctgct 

75051 ggccgccgcc gaggatctcc acccggactt cggctcggta ccgctggggc 

75101 tggggcagac cctcgtcctc gtccaggccc tcggcgacgc cggcctgacc 

75151 gcgcccctgt ggtgcctgac ccgcggcgcc gtcgccaccg gccgcgacga 

75201 cgccctcgac agcccgaccc agggcgccct gtggggcctc ggccgggtcg 

75251 tggccctgga acaccccgac cgctggggcg gcctgatcga cctgcccgcc 

75301 accctcgacg cccgcgccgc ggcccgcctc accggcctgc tcgccgaccc 

75351 cgccggtgag gaccaactcg ccgtccgcgc caccggcgtg ctcgcccgcc 

75401 gcatggtgca cgccgcgccg tccgcgcccc gcaccgggcg ccgctggcgg

75451 ggccgcggca cctgcctgat caccggcggc accggcggca tcggcggccg 

75501 ggtcgcccgc tggatggccg agcacggcgc cgcccacctg gtcctgacca 

75551 gccggcgcgg cccggacgca cccggcgccg ccgcactccg ggccgaactg 

75601 gaggccctgg gcgcccgcgt caccctcgcc gcctgcgacg tcgccgaccg 

75651 cgacgccctg gccgccctgc tggccgacct ccccgccgac cagccgctca 

75701 cctccgtctt ccactccgcc ggcgtggccg acggggacgc ccgggcagcc 

75751 gacctgaccc tagatcagct cgacgcgctg ctgcgcgcca aactgaccgc 

75801 cgcccaccac ctgcacgagc tgaccgcccc cctcgacctc gacgcgttcg 

75851 tgctcttctc ctccggcgcc gcggtctggg gcagcggcgg ccagcccggc 
75901 tacgccgccg ccaacgccta cctcgacgcc ctcgccgccc accgcaggtc
75951 cctcgacctg cccggcgcgt ccgtcgcctg gggcacctgg ggcgaggtcg
76001 gcatggccac cgtccccgag gtccacgagc gactgcaccg ccaaggggtc
76051 cgcgccatgg aaccggacca cgcgatcggc gcgctccagc agatgctgga
76101 ggacgacgac accaccctcg ccgtgaccct catggactgg gaggcgttcg
76151 caccgagctt caccgcgacc cgacccagcg ccctgttcag cacggtgccc
76201 gaagccgtcc gcgcggtgac cggcgacccg ggcaccacgg ccggcgacga
76251 cgtggactcc gcgaccccgc cgctccgccg ccacctggag gagctgtccg
76301 ccgccgagcg cggccgggcc ctggtcgagg cggtccgcgc cgaggcgtcc
76351 gcgaccctcg gccacgacac ccccgacgcc atccccgccg gccgtgcctt
76401 ccgcgacgtc ggcttcgact cggtcaccgc cgtcgaactg cgcaaccggc
76451 tgcgcaccgc cctcggcctg ccgctgccgg ccgcgctcgt cttcgaccac
76501 cccaccccca cggccctcgc cggccacctc ggcgcgctgc tcttcggcac
76551 cgcccccgag gacgccggca ccggccgccc cgacgacccc gacgcccgca
76601 tccgcgaggc gctcgccacc gtccccatcg gacggctgcg caaggcgggc
76651 ctcctcgaca tggtgctgaa actcgccgac ggagacgcga ccgacgcccc
76701 cgcccccgag gccgacgccc cctcggaatc cctcgacgac atggacgccg
76751 aagccctgct gcggctggcc accgagaact cggcgaactg aaagagagct
76801 ggagcccacc atgagcacga accccgacaa gtacgtcgag gcactccggt
76851 cgtccctgaa ggagatcgag cggctgcgcc ggcagaacga acagctggtg
76901 gcggcggcgg tcgagcccgt cgcggtcgtc ggcatcggct gccgcttccc
76951 cggcggcgtc acctcccccg aggacctgtg ggagctggtc gccgaggggc
77001 gcgacgtcat cgggccgttc ccgcaggacc gcggctggga cctggagaag
77051 ttggccggcg gcggcgaggg cggcagcctc gcgcaggtcg gcggcttcgt
77101 cgaggacgcc gccggcttcg accccggctt cttcggcatc tccccgcgcg
77151 aagcggtcgc catggacccg cagcagcgca tcctgctgga gatcacctgg
77201 gaagccctgg aacgcgccgg catcgacccg tccaccctgc gcggtacccc
77251 caccggcgtc ttcgtcggca ccaccggcca ggactacggc gaggtcatca
77301 aggcgtccgc cgaggacgtc gaggtctact cgaccaccgg ccacgccgcc
77351 agcgtcatct ccggccggct ctcctacacc ctcggcgccg agggcccggc
77401 cgtcaccgtc gacaccggct gctcctcgtc cctggtcgcc ctgcactggg
77451 ccgtccaggc gctgcgcggc ggcgagtgct ccatggccct ggccggcggc
77501 gcgtccatca tggccacccc gggcccgttc gtcgccttca ccgcgcagag
77551 cggcctggcc gccgacggcc gctgcaagcc cttctccgac cgggccgacg
77601 gcaccggctg gggcgagggc gccggcatgc tggtcctgat gcggctctcc
77651 gacgcccagc gcgagggccg cccggtcctc gccgtgctgc gcggctccgc
77701 catcaaccag gacggcgcct ccaacggcct gaccgctccc aacggcccct
77751 cccagcagcg cgtcatccgc gccgcgctgg acagcgccca cctcaccgcc
77801 gccgacatcg acgccgtcga ggcccacggc accggcacca ccctcggcga
77851 cccgatcgag gcccaggcgc tcctggcgac ctacggacag gaccggccgc
77901 ggcccctgtg gctcggctcg gtgaagtcca acatcggcca cacccaggcc
77951 gcctccggtg ccgccggcgt gatcaaaatg atcatggcgt tgcagcgcgg
78001 cgtgctgccg cgcagcctgc acgccaccga acccaccacg gacgtcgact
78051 ggaccgccgg ctccgtcgac ctcctcgacg agacggtcgc ctggcccgag
78101 accggacgcg cccgccgcgc cggcgtctcc tccttcggca tcagcggcac
78151 caacgcccac gtcatcctcg aacaggcccc caccgccccc gaagagccca
78201 ccaccgaacc caccgtccgc cccgccgtcg tcccgtgggc gctctccgcc
78251 cgcaccgccg ccgccctcga cgcccagcgc gcccgcctca ccggccacct
78301 cgccgacacc cccgacgccg accccctcga cgtcggctac gcgctcgccg
78351 acggacgcgc caccttcgaa caccgcgccg tcctgctccc cgacggcacc
78401 gaactcgccc acggaaccgc cggcgaaggc ccctgcgccg tcctcttctc
78451 cggccagggc tcccagcgcc cgggcatggg acgcgaactc cacgcccgct
78501 tcccggtgtt cgccgcggcc ttcgacgaga tcacagcgct cctcgacacc
78551 cacctcgacc gcccgctgcg cgaggtcgtc tggggcaccg acgccgacct
78601 gctgaacgac accggctggg cccaacccgc cctgttcgcc gtcgaggtcg
78651 ccctctaccg cctggtcgcg tccctcggcg tcacccccga cttcgtcggc
78701 ggccactcca tcggcgagct cgccgccgcg cacgtcgccg gggtcctctc
78751 cctcgaagac gcctgcaccc tcgtcgccgc ccgcgcccgc ctcatgcagg
78801 ccctgccgcg cggcggcgcg atgctcgcga tccgcgccac cgaggacgag
78851 gtcacccccc acctcaccga cgacgtctcg atcgccgccg tcaacgggcc
78901 cacctccgtc gtcgtcgccg gcaccgagga agccgtcgcc gcgatcgggg
78951 cgcgcttcac cgcccaggac cgcaagacca cccggctgcg ggtcagccac
79001 gccttccact cgccgctcat ggacccgatg ctggcggaat tccgcgccgt
79051 cgccgcgggc ctgacctacc acgagccgcg catcccggtc ctctccaacc
79101 tcaccggcac cgtcgccgcc gtcgccgacc tgtgctccgc cgactactgg
79151 gtccgccacg tccgcgaggc ggtccgcttc gccgacggcg tcaccgccct
79201 caccgaccgc ggcgtgacca cgctcgtcga actcggcccg gacggcgtgc
79251 tgtccgccat ggcccaggaa tccctgccgg acggcgccgc cgccgtgccg
79301 ctgctgcgca aggaccgccc cgaggagctc tccgccgtca ccggcctggc
79351 ccgcgcccac gtccgcggcg tcacggtccg ctgggccggc ctcttcgacg
79401 gcaccggcgc gcgccgcgcc gacctgccca cctacccctt ccagcaccag
79451 cggttctggc cgaccgcggc ccgcgccgcc caggacgtca ccgccgcggg
79501 actgggcgcc gccgaccacc cgctgctcgg cgccaccgtc gaactcgccg
79551 acggggccgg ctacttgttc accagccggc tctccgtccg gacccacccc
79601 tggctcgccg accacggggt ccagggccgg gccctgctgc ccggcaccgc
79651 cttcgtcgaa ctggccgtcc gcgccggcga cgaggccggc tgcgaccgcg
79701 tcgaggaact gaccctggcc gcccccctgg tgctgcccga gcgcggcggc
79751 gtccaactcc aggtccgcgt cggcgccccc gacgccgccg gccgccgcac
79801 cctcggcatc ttctcccgcg tcgaggacgg cttcgacctg ccctggtcgc
79851 aacacgccac cggcgtcctg accgccggcg ccggcgcccc cgaccccacc
79901 ttcgacgcca ccgtctggcc ccccagcggc gccgaacccg tcgacctcac
79951 cggcgcgtac gagcgcctgg ccgcactcgg cttccagtac ggccccgcct
80001 tccagggcct gcgcgccgcc tggcgccgcg acaccgaggt ctacgccgaa
80051 gtggccctgc ccgacggcgc ggacaccgac cccgccgcct tcggactgca
80101 cccggccctg ctggacgccg cacaacacgc cgccgcctac gccgacctcg
80151 gcgccatcag ccgcggcggc ctgccgttcg cctgggaagg cgtctcgctc
80201 gccgccgccg gcgccaccac cgtccgcgcc cggatcgccc cggccggcga
80251 ggacaccgtc accatcgccg tctacgacgc cgccggcggc accgtgctgt
80301 ccgtcgactc cctggtctcc cgcgaggtcc ccgccgacgc acccggcgcc
80351 gccggcaccg tccaccgcga ctccctcttc cacgtcgagt ggaccccgct
80401 ccagggccgc ccgggccccg caccggccac cgtcgccgtc ctcggccccg
80451 acccggacgc cctcgccgac accctccgcg ccaccggcat ccggaccacc
80501 gccccccgcg acctggccgc cctcgccgac gccgaagggc ccgtccccga
80551 cctggtcgtc accaccctca ccaccacccc gggcgccccc gtccccgacg
80601 ccgcgcacgc caccaccgcc gccgtcctcg ccctcgccca acagtggctc
80651 gccgacgacc gcttcgccga cgcccgcctg gtcctcgtca cccgcggcgc
80701 caccgacggc accgaccccg ccgccgcggc cgccggcggc ctgatccgca
80751 ccgcccgcac cgagaacccc ggccgtttcg ccctcctcga cctcgccccc
80801 gacaccggcc ggcccgaccc cgagaccctg gccaccgccc tggccgccag
80851 ccacgacgag cccgacctcg ccgtccgcgg caccgacgtg cacgccgccc
80901 gcctggcccg tgtcccgctc gccaccgaac ccaccacctg gaacccggac
80951 ggcaccgtcc tgatcaccgg cggcaccggc ggcctgggcg cggtcctcgc
81001 ccgccacctg gtcgccaccc acggcgtccg ccacctgctg ctcgccagcc
81051 gccgcggccc ggccgccgac ggcgccgacg acctgacggc cgaactcacc
81101 gggctcggcg ccaccgtcca catcgccgcc tgcgacgtcg ccgacccggc
81151 cgccctcgcc gacctgctcg gcaccgtccc ggccgggcac ccgctcaccg
81201 tcgtcgtcca caccgccggc gtcgtcgacg acggcgtcct cggctccctc
81251 accccgcagc gcctggacac cgtcctgcgg cccaaggccg acgccgcctg
81301 gcacctgcac gaggcgaccc gccacctcga cctggacgcc ttcgtcctct
81351 tctcgtccgt cgccgccacc ctcggcagcc ccggacaggc caactacgcc
81401 gccggcaacg ccttcctgga cgccctcgcc gcccggcgcg ccgccaccgg
81451 cctgcccgcc acctccctcg cctggggccc gtggacccag agcgtcggca
81501 tgacaagcag cctgtccgac ctcgacgtcg agcgcatcgc ccgctccggc
81551 atgcccccgc tgaccctgga acagggcacc gccctcttcg acgcggccct
81601 ggccgccggg cccgccgccc tcgccccggt ccgcctcgac ctgcccgtcc
81651 tgcgcaccca gggcgacatc gccccgctgc tgcgcggcct gatccgcacc
81701 cccgtgcggc gcaccgccgc ccaggtctcg cagaccgccg acggcctcgc
81751 ccagcggctc gccggcctcg acgccgccgc ccgccgggaa gccctcctgg
81801 aactcgtccg cacccagatc gcccaggtcc tcggccacgc ggacgccacc
81851 gaggtggaga ccggccgcca gttccaggac ctcggcttcg actccctcac
81901 cgccgtcgaa ctccgcaacg ccctgaacac cgccaccggc ctgcggctgc
81951 ccgccaccat ggtgttcgac tacccgacac cacacgccct cgccgaccac
82001 ctgcgcgacg aactcctggg caccgaggcc gagtcgacca ccgccgtccc
82051 cgtgccgacc cgtaccgccg gcaccgacga cccgatcgtc atcgtcggca
82101 tggcctgccg ctaccccggc ggcatcgcct cacccgagga cctctggcgc
82151 ctggtcagcc agggcgccga cgccactggc ccgttcccca ccaaccgcgg
82201 ctgggacctg gacaacctct acgaccccga ccccgaccgc ccgggccgca
82251 cccacgtccg cgccggcggc ttcctgcacg acgccggctc cttcgacgcc
82301 gacttcttcg ggatgagccc gcgcgaggcg atggccaccg actcccagca
82351 gcgcctgctg ctcgaactct cctgggaagc cgtcgaacgc gccggcatcg
82401 accccgcctc actgcgcgac tccggcaccg gcgtcttcgc cggcgtcatg
82451 tacaacgact acggcaccac cctgaccggc gacgagtacg aggcgttccg
82501 cggcaacggc agcgccccga gcgtcgcctc cggccgcgtc tcctacaccc
82551 tcggcctgga aggcccggcc gtcacggtgg acaccgcctg ctcttcctct
82601 ctggtcgccc tgcactgggc ggcgcaggcg ttgcgggcgg gggagtgctc
82651 gttggcgttg gccggtggtg tgacggtgat gtcgacgccg agcacgttcg
82701 tggagttctc gcggcagcgg ggtctggcgc ctgatggtcg ttcgaaggcg
82751 ttcgccgagg ccgcggacgg cgtggcctgg tccgagggcg tcggcatgct
82801 ggtcctggag cggcagtcgg acgcggtgcg caacggtcac gagatcctgg
82851 ccgtggtgcg cggctcggcg gtcaaccagg acggtgcgtc caacggtctg
82901 accgcgccca acggcccgtc ccagcagcgg gtgatccgtc aggcgttggc
82951 cagtggcggc ctgtccacgg ccgacgtgga cgccgttgag gcgcacggca
83001 cgggtacgac gctcggtgac ccgatcgagg cccaggcgct cctggccacc
83051 tacggtcgcg accgcgaccc cgagaacccg ctgctgctcg gctcgatcaa
83101 gtccaacatc ggtcacaccc aggcagcggc cggtgtcgcc ggtgtcatca
83151 agatggtcat ggcgatgcgg cacggcgtgc tgccgcagac cctgcatgtc
83201 gacgcgccgt cctcgcacgt cgattggagc gtcggcgccg tcgaactgct
83251 caccgagcag accgcctggc cggagaccgg ccgggcccgt cgcgccggtg
83301 tctcctcctt cggcatcagc ggcaccaacg cccacgtcgt catcgagcag
83351 tccccgaccg ccgtccccgc cacgcccgcg tccgccgacc ggtccgtcga
83401 ggaaccgccg gccgtcccct gggccctgtc cggcaagacc cccgacgccc
83451 tccgcgacca ggccgcccgc ctcctcgccc acgtcgaggc ccaccccgca
83501 ctgcgccccg tcgacatcag ctactccctg atcgccaccc gcaccgcctt
83551 cgaccaccgc gccgtcgtcc tcggcaccga ccgcgccgag gccctgcgcg
83601 ccctcaccgc cctcgccgcc ggcgagaccg acccggccgc cctcaccggc
83651 accgtccgca ccggccgcac cgccttcctc ttctccggcc agggctccca
83701 acggctcggc atgggccgcg tcctctacga gcggttcccc gccttcgccg
83751 aagccctcga caccgtcctc accgccctcg acgcggaact cggccacccc
83801 ctccgcgaca tcatctgggg cgaggacgct caactcgtcg accggaccgg
83851 ctacacccaa cccgccctgt tcgccatcga ggtggcactc ttccgcctcc
83901 tggaagcctg gggcatcaca ccggacttcg tggccggcca ctccatcggc
83951 gagatcgccg ccgcacacgt cgccggcgtg ctctccctcg gcgacgcctg
84001 ccgcctcgtc gtggcccgcg ccgtgctgat gcagtcgctg cccgaaggcg
84051 gcgcgatgat cgccgtccag gccaccgagg acgaggtcct gcccctcctc
84101 accgacgacg tctcgatcgc cgccgtcaac agcccgacct ccgtcgtcgt
84151 ctccggctac gagaacgcca ccctcgccgt cgcccggcac ttcgccgacc
84201 agggccgccg caccacgcgg ctgcgcgtca gccacgcctt ccactcgccg
84251 ctgatggcgc cgatgctcga cgacttccgc gccgtcgtcg agagcctcac
84301 cttcaccgcc cccacgaccc ccgtcgtctc caacctgacc ggcgaactgg
84351 ccccggccga ggcgctctgc tcggccgact actgggtccg gcacgtccgc
84401 gaggcggtcc gcttcgccga cggcatccgc accctcgccg accgcggcgt
84451 caccaccttc gtcgaactcg gccccgacag cgtgctgtcc gccatggccc
84501 aggagtccgc ccccgaaggc gccggcacca tcccgctcct gcgccgcgac
84551 cggcccgagg aacaggccgt cctggccgcc ctctgccacc tccaggtgct
84601 cggcgtcgag gccgactggt ccgccacctt ccgcggcctc gaccccgtcc
84651 gcgtcgacct gccgacctac gccttccagc accgctggtt ctggcccgcc
84701 gcccgacccg cccgccccga cgacgtccgc gccgccggcc tgggcgccgc
84751 cgaacacccc ctcctcggcg ccgccgtgca actccccgac gacgacggcg
84801 cactcttcac cggccgcctc tccctgcgca cccacccgtg gctggccgac
84851 cacaccgtcc tgggcaccgt cctgctcccg ggcaccgcac tggtggaact
84901 cgccgtccgc gcgggcgacg agaccggcag cggccacctc gaagaactca
84951 ccctcgccgc gcccctgacc ctccccgagg acggcgccac cctcctccag
85001 gtccgcgtcg gatccgccga cgacaccggc cgccgcaccg tcaccgtcca
85051 cgcccgcccc gacgacaccg ccgaccgcac ctggacgctg cacgccaccg
85101 gtgtgctcgc caccacgcca ccggccgccg cggcgttcga caccacggtc
85151 tggccgcccg ccgacgccga acccctcacc accgacgact gctacgcaca
85201 cttcaccacc caccgcttcg cctacggccc cgccttccag ggcctgcggg
85251 ccgcctggcg cgccggcgac gtgctgtacg ccgaggtcgc cctgccggag
85301 tccgccaccg acgaagcggc cgccttcggc ctgcacccgg cgctcctgga
85351 cgccggcctg cacgccgcgc tcctcgccga cgaccgcgac accggactcc
85401 cgttctcctg ggaaggcgtc actctgcacg cctccggcgc caccgcgcta
85451 cgcgtccggc tcgccccgaa cggccccaac ggcctgtccg tcaccgccgc
85501 cgacccggcc ggcaaccccg tcgccaccgt cacccgcctg ctcgcccgcc
85551 ccctggacgc cgagcagttg accatccaca gcgccctgac ccgcgacgcg
85601 ctcttccacc tggactggac cccggtcccg cttcccgaca ccgccaactc
85651 cgcgccgccg gccctcctcg gcccggacac cgccgtgctc gccgacgccc
85701 tcggcgaccc ggccgtcgca cgccacgcaa ccctcgacga cctcctggcc
85751 ggggacacca ccccgcccgc cacggtcctc gtccccctcg gcgccccact
85801 cgacggcgac accgcgcagc acgcgcacgc cctcacccgc agcgcgctga
85851 ccctcgtcca gcagtggctc gccaccgacc gcctcgccga ctcccgcctg
85901 gtcttcgtca cccacggagc cgtcgccacc gacgacgcgc cccccaccga
85951 cctggccgcc gccgcggtct ggggcctgat ccgctccgcg cagaccgaga
86001 accccggcac cttcaccctc ctcgacctcg acaccgagcc cgactcgacc
86051 accgcgctca gccgcgccct gaccctcgac gaaccacagc tcctcctccg
86101 cgccggccgc gcccgcgccg cccgcctcac ccgcaccccc gcccccacca
86151 ccaccaccca cacgccgtgg tccgcggacg gaacggtgtt ggtgacgggt
86201 ggtacgggtg gtctgggtgg gttggtggcc cggcatctgg tgcggtcgtg
86251 tggggtgcgg catttgttgt tgaccagtcg ttctggtgtg ggtgctgcgg
86301 gtgcggccgg gttggtcgcg gagttggagt cgttgggcgc gcgggttgtg
86351 gttgcggcgt gtgatgtggg tgatggctcg gctgttgcgg agttggttgc
86401 cggtgtgtcg gagtcgtatc cgttgtctgc ggtggtgcat gcggctggtg
86451 tgttggatga cggtgtggtg ggttcgttga cgccggagcg gttggctgcg
86501 gtgttgcgtc cgaaggtgga tggtgcgtgg aacctgcatg aggcgacgcg
86551 tggtctggat ctggacgcgt ttgttgtctt ctcgtctgtt gcgggtgtgt
86601 tcgggggtgc gggtcaggcc aactatgcgg cgggtaatgc gtttttggac
86651 gcgttgatgg ttcatcgggt ggctggtggg ttgcctggtg tgtcgttggc
86701 gtggggtgct tgggatcagg gtgtggggat gacggcgggg ctgacggagc
86751 gggatgtccg tcgtgctgct gagtcgggta tgccgttgtt gacggttgat
86801 cagggtgtgg cgttgttcga tgcggcgttg gcgacgggga gtgccgcgtt
86851 ggtgccggtc cgtctggacc tggccgcact gcgcacccgg ggcgacatcg
86901 caccgctcct ccgcggcctc gtccgcgcac cgctgcgccg caccgcggcc
86951 accggcctcg ccaccggcgc ggacaccggc ctcgtccaac ggctcggccg
87001 actcgaccac gcccaacgcc acgaggcact gctcgacatg gtccgcagca
87051 gcgccgcgct cgtcctcggc cacgccgacg gcaacgccat cgacgccgaa 
87101 cgcgccttcc gcgacctcgg cttcgactcg ctcaccgcgg tcgaactccg 
87151 caaccgtctg cgcaccgcca ccggcctgca cctgtcggcc accatggtct 
87201 tcgaccaccc caccctgtcc gccctcgcgg agcacctgcg ggacgagttg

87251 ttcggcgcgg tcgagagcga ggtgcgggtg ccggtccagg cactgccgcc 
87301 gaccgccgac gatcccatcg tggtggtggg catggcctgc cgtttccccg 
87351 gtggtgtgac ctcgcccgag gacctgtggc gcctggtcga cgacggcacc 
87401 gacgccatca ccaccttccc gaccaaccgc ggctgggacc tggacaacct 
87451 ctacgacccg gaccccgagc acttcggcac gtcgtacacc cgctccggtg

87501 gcttcctgca cgaggcgggg gagttcgacc cggcgttctt cggaatgagc 
87551 ccgcgtgagg cgctggcaac cgactcccaa cagcgtctcc tgctggaatc 
87601 ctcctgggag gcgatcgagc gggccggcat cgacccgctg accctgcgcg 
87651 gcagcgccac cggcgtcttc gccggcgtga tgtacagcga ctacgggagc
87701 atcctcggcg gcaaggagtt cgagggcttc caaggccagg gaagtgcggg

87751 cagcgtggcc tcgggccgcg tctcctacgc cctcggcttc gagggcccgg
87801 ccgtcacggt ggacacggct tgctcttcct ctctggtcgc cctgcactgg
87851 gcggcgcagg cgttgcgggc gggggagtgc tcgttggcgt tggccggtgg
87901 tgtgacggtg atgtcgacgc cgagcacgtt cgtggagttc tcgcggcagc
87951 ggggtctggc gcctgatggt cgttccaagg cgttcgccga ggccgcggac

88001 ggcgtcggct ggtccgaggg cgtcggcatc ctcgtcctgg agcgccagtc 
88051 ggacgcggtg cgcaacggcc acgagatcct cgccgtgatc cgcggctcgg 
88101 cggtcaacca ggacggtgcg tccaacggcc tgaccgcgcc caacggcccg 
88151 tcccagcagc gcgtcatccg tcaggcgttg gccagtggcg gcctgtccac 
88201 ggccgacgtg gacgccgtcg aggcgcacgg cacgggtacg acgctcggtg

88251 acccgatcga ggcccaggcg ctcctggcca cctacggccg tgaccgcgac 
88301 cccgagaacc ccctgtggct gggctccctg aagtccaaca tcgggcacac 
88351 ccaggcagcg gccggtgtcg ccggtgtcat caagatggtc atggcgatgc 
88401 ggcacggcgt gctgccgcag accctgcatg tcgacgcgcc gtcctcgcac 
88451 gtcgattgga gcgtcggcgc cgtcgaactg ctcaccgagc agaccgcctg

88501 gccggagacc ggccgggtcc gtcgcgccgg tgtctcctcc ttcggcatca 
88551 gcggcaccaa cgcccacgtc atcgtggaac agccggcgct cgtcgaaagc 
88601 ccggccgcgg agccgagcgg acgcgaaccc ggcgtcgttc cgctgccgct 
88651 gtccggaaag tcccccgagg ccctgcgcga ccaggccgca cgcctgctgg 
88701 ccgggttggc ggagcggccc gcgctgcgcc cgctcgacct cggctactcg

88751 ctggcgacga cccgttcggc gttcgaccac cgggcggtgg tgctcgccac 
88801 cgaccgcgcc gatgcggtcc gcgcgctgac ggcgctcgcc gccgccgacg 
88851 cggatctctc cgccgtcgtc ggcgacaccc gcacgggtcg tcacgcggtg
88901 ctgttctcgg gtcagggctc gcaacgcctg ggcatggggc gtgagttgta
88951 cgagcgtttc ccggtcttcg ccgaggctct cgatgtcgcg atcgaccacc
89001 tggacgccgc cttgcccgcc caggccagtc tgcgtgaggt gatgtggggc
89051 gacgatgtcg agctgctgga cgagacgggt tggacgcagc cggctctgtt
89101 cgccgtcgag gtggccctgt tccggctggt ggagagttgg ggtgtccgtc
89151 cggacttcgt ggccggtcat tccatcggtg agatcgcggc ggcgcatgtc
89201 gtcggggtgt tctcgctgga ggacgcctgc cgtctggtgg ccgcccgtgc
89251 gacgctgatg caggcgctgc cgaccggcgg cgcgatgatc gcgatccagg
89301 ccgccgagga cgaagtcacc cagcacctga ctgacgacgt ctcgatcgcc
89351 gccgtcaacg gcccgacctc cgtggtcgtc tccggggccg agagcgctgc
89401 ccgcacggtg gccgaccggc tcgcggagaa cggccgcaag acgacccggc
89451 tgcgggtctc gcatgcgttc cactcgccgt tgatggatcc gatgctggcg
89501 gagttccgtg cggtggccga gggcctgtcc tacgccaccc cgaccctccc
89551 cgtcgtctcg aacctgacgg gccggctggc cacggccgat gacctctgct
89601 cggccgagta ctgggcgcgc cacgtccgcg aggcggtccg cttcgccgac
89651 ggcgtcagca ccctggagaa cgagggcgtc accacgttcc tggaactggg
89701 accggacggc gtgctgtccg ccatggccca gcagtcgctc accggcgacg
89751 ccgccaccgt cccggccctc cgcaaggacc gcgacgagga gacgtccgcg
89801 ctcaccgccc tcgcccacct ccacacggca ggtctccgcg tcgactgggc
89851 ggcgttcttc gccggcagcg gcgccacccg cgtggacctg ccgacctacg
89901 ccttccagca cgccacctac tggcccaccg gcaccctgcc caccgcgcac
89951 gccgcggccg tcggcctcac cgccgccgag cacccgctgc tgaacggttc
90001 ggtcgaactc gccgaaggcg aaggggtgtt gttcaccgga cggctgtcac
90051 tgcagtcaca tccgtggctg gccgaccacg ccgtcatggg acaggtcctg
90101 ctgcccggca ccgcactgct ggaactggcg ttccgggccg gcgacgaggc
90151 cggttgcgac cgcgtcgagg aactgacgct cgccgcaccg ctcgtcctgc
90201 ccgagcgcgg tgcggtacag acccaggtcc gggtcggcgt cgccgacgac
90251 accggccgcc gtaccgtcac cgtccactcc cggcccgagc acgcgaccga
90301 cgtgtcgtgg acccagcacg cgaccggcac cctgaccatg ggctccgccc
90351 cggccgacac cggtttcgac gccactgcct ggccgcccgc cgacgccgaa
90401 cccctcgcca ccgacgactg ctacgcgcgc ttcacgacgc tcggcttcgc
90451 ctacgggccg gtcttccagg gcctgcgggc cgcctggcgc gccggtgacg
90501 tgctgtacgc cgaggtggcc ctggcggagt ccaccggcga cgaggcgacc
90551 gccttcggtc tgcaccccgc actgctcgac gccgccctgc acgcctccct
90601 cgtcgcccac gagggcgagg agagcaacgg cggactgccg ttctcctggg
90651 agggcgcgac cctctacgcg accggcgcca ccgcgctgcg cgtccggctg
90701


accccgacgg 


gcaccgacgg 


ccgttcggtg


/-i /— /-* a f- n r~r~ n 
^vCai ^y i^i^y 






90751 


cgccggtcgt 


ccggccgccg


CCalt.gat.ad 


l.l~y i_ ^ v» v_.y 






90801 


ccggcgacca 


gttgaccggc 


gccgcgggac 


tggcccy cy a 






90851 


accctggact 


ggaaccccgt 


accggagaac 


Cucg u a ccgg 




5 


90901 


accggagaac 


accggcgggg 


gccacgccca 


ggaccaggac 


ggccggcccg 




90951 


ccgcggccac 


cgtcgcgctg 


gtcggcgcgg 


acggcaccgc 


yd Lt.y oLy(,L 




91001 


gacctgaccg 


ccgccggcat 


ccacaccacc 


ctccaccccg 


aULLLdLLaL 




91051 


cctcgccacg 


accgacgccg 


acgttccgaa 


gacggtcctc 






91101 


ccggaaccgg 


aaccggaacc 


ggcaccggga 


ctgagtcgac 


n n /r n a a ^ 

gydcygdaUL 


10 


91151 


gggacggggg 


ccgccgagtc 


ggacgcgtcc 


gccccctccc 


cggccgaggt 




91201 


cgcccacacc 


ctgtccaccg 


ccgcactcgc 


cctcgtccag 


gagcggaccg 




91251 


cacaggagcg 


cttcgccggc 


tcccgcctgg cgttcgtcac 


gaccggggcg 




91301 


acggccgccg 


gcggtaccga 


cgtcatggac 


gtggccgccg 


ccgcggtctg 




91351 


gggcctggtc 


cgatccgccc 


agtccgaagc 


cccggacacc 


n m ^- a a ^ /T ^ 

ttcycccuyd 


15 


91401 


tcgaccgtga 


ccccggcccg 


gccggcacgc 


acgaccgcac 


agccgccgcc 




91451 


gaacggggcc 


aactgctcct 


acgggcactg 


cacaccgacg 


aaccgcag ci 




91501 


cgccctgcgt 


gacggcggcg 


tgctcgccgc 


ccgcctggcc 


/-•/-i ♦* ^ ^ /i o n 

cyccc cgaca 




91551 


ccgcggccgc 


gctcaccccg 


ccggccgacc 


gggcctggcg 


gctcgacagc 




91601 


acggccaagg 


gcagcctcaa 


cggcctcgcc 


ctgaccccgt 


atccggcggc 


20 


91651 


actggcgccg 


ctcaccggcc 


acgaggtgcg 


ggtcgaggtg 


cgtgccgcgg 




91701 


gcctgaactt 


ccgtgacgtg 


ctcaacgcgt 


tggggatgta 


tccgggtgat




91751 


gatgtcggat 


cgttcggttc 


ggaggcggcc 


ggtgtggtcg 


tcgaggtcgg




91801 


accggaggtg 


accggcctgg 


cccccggcga 


ccaggtcatg 


ggcatgatca




91851 


ccggcagctt 


cggctcgctc 


gccgtggacg 


acgcgcggcg 


cctcgcccgc 


25 


91901 


ctgcccgagg 


actggtcctg 


ggagacgggt 


gcgtcggtgc 


cgttggtgtt




91951 


cctcaccgcg 


tactacgccc 


tgaaggagtt 


gggtggtctg 


cgggcggggg 




92001 


agaaggtgct 


ggtgcatgcc 


ggtgccggtg 


gtgtcggtat 


ggcggcgatc 




92051 


cagatcgccc


ggcatgtcgg 


tgccgaggtg 


ttcgccacgg 


ccagtgaggg 




92101 


caagtgggac 


gtgttgcgct 


cgctcggcgt 


ggccgacgac 


cacatcgcct 


30 


92151 


cctcccgcac 


cctcgacttc 


gaggcggcct 


tcgccgaagt 


cgccggcgac 




92201 


cgcggcctgg 


acgtcgtcct 


caactccctc 


gccggtgact 


tcgtcgacgc 




92251 


ctcgatgcgg 


ttgctcggcg 


acggcggccg 


gttcctggag 


atgggcaaga 




92301 


ccgacatccg 


cgccgcggac 


tccgttcccg 


acggcctctc 


ctaccagtcc 




92351 


ttcgacctcg 


cctgggtggt 


gccggaaacc 


atcggcacca 


tgctggccga 




92401 


gctgatggac 


ctcttccgca 


ccggcgcact 


gcggccactc 


ccggtccgca 




92451 


cctgggacgt 


ccggcacgcc 


aaggacgcgt 


tccgcttcat 


gagcatggcc 




92501 


aagcacatcg 


gcaagatcgt 


gctcaccctg 


ccccgctcct 


ggaagcccga 
92551 gggaacggtg ttggtgacgg
92601 cccggcatct ggtgcggtcg 
92651 cgttctggtg tgggtgctgc 
92701 gtcgttgggc gcgcgggttg 
92751 cggctgttgc ggagttggtt

92801 gcggtggtgc atgcggctgg 
92851 gacgccggag cggttggctg 
92901 ggaacctgca tgaggcgacg 
92951 ttctcgtctg ttgcgggtgt 
93001 ggcgggtaat gcgtttttgg

93051 ggttgcctgg tgtgtcgttg 
93101 atgacggcgg ggctgacgga 
93151 tatgccgttg ttgacggttg 
93201 tggcgacggg gagtgccgcg 
93251 ctgcgcaccc ggggcgacat

93301 gcccatccgc cgcgcagccg 
93351 agcagctcac ccggctccag 
93401 ctcgtccgcg accaggccgc 
93451 cgtcgacccg tcccgcgcct 
93501 cggtcgaact ccgcaaccgc

93551 gccacggccg tcttcgacta 
93601 gctcaccgaa ctgctcggcc 
93651 gcgaccccac cgcgggaccg 
93701 agctgccgct tccccggcga 
93751 gctcggcgac ggcgccgacg

93801 gggacctgga caacctctac 
93851 tacgcccgca ccggcggttt 
93901 cttcttcggc atgagccccc 
93951 gcctgctgtt ggagtcctcg 
94001 ccgctgaccc tgcgcgacag

94051 cagcggctac ggcacccgcc
94101 ggcagggcag cgcactgagc 
94151 ggcttcgaag gcccggccat 
94201 ggtcgccctg cacctcgccg 
94251 tcgccctcgc cggtggtgtc

94301 gagttctccc ggcagcgcgg
94351 ctccgagtcc gccgacggcg 



gtggtacggg tggtctgggt gggttggtgg 
tgtggggtgc ggcatttgtt gttgaccagt 
gggtgcggcc gggttggtcg cggagttgga 
tggttgcggc gtgtgatgtg ggtgatggct 
gccggtgtgt cggagtcgta tccgttgtct 
tgtgttggat gacggtgtgg tgggttcgtt 
cggtgttgcg tccgaaggtg gatggtgcgt 
cgtggtctgg atctggacgc gtttgttgtc 
gttcgggggt gcgggtcagg ccaactatgc 
acgcgttgat ggttcatcgg gtggctggtg 
gcgtggggtg cttgggatca gggtgtgggg 
gcgggatgtc cgtcgtgctg ctgagtcggg 
atcagggtgt ggcgttgttc gatgcggcgt 
ttggtgccgg tccgtctgga cctggccgca 
cgcaccgctc ctccgcggcc tggtcaaggc 
ccaccacacc cggcgacacc ggactcgccg 
cgcgccgagc gacgggacac cctcctcgcg 
gatggtcctc ggccacacct cgggcgacgg 
tccgcgacct cggcttcgac tcgctcaccg 
atcggcgcgg ccaccggcct gcggctaccg 
ccccaccgcc gatgccctcg ccgcacacct 
ccgacgccga gtcggacccc gacgagcccg 
accgacgacc ccatcgtcat catcggcatg 
catcggctcg ccggaggacc tgtggcgcct 
tcgtcaccga cttcccgacc aaccgcggct 
gaccccgacc ccgcgcacgc cggcacctcg 
cctgcacgac gccgccgact tcgacgccga 
gcgaggccat ggccacggac tcccagcagc 
tgggaggcga tcgagcgggc cggcatcgac 
ccgcaccggc gtcttcgccg gcgtcatgta 
tcgacggcgc cgaattcgaa ggcttccagg 
gtggcctccg gccgggtctc ctacaccttc 
gacggtcgac accgcctgct cctcctcgct 
cacaggcact ccgcggcggt gagtgcaccc 
accgtgatgt ccatcccgga caccttcatc 
actggccccc gacggccgct ccaagccgtt 
tcggctggtc cgagggcgtc ggaatgctgc 
94401


tcctggagcg 


ccagtcggac gccgtgcgca 


acggccacca 


gatcctggcc 




94451 


gtggtgcgcg gctcggcggt caaccaggac 


ggtgcgtcca 


acggcctgac 




94501 


cgcgcccaac 


ggcccgtccc agcagcgggt 


gatccgtcag 


gcgttggcca 




94551 


gcggcggcct 


gtccacggcc 


gacgtggacg 


ccgtcgaggc 


gcacggcacg 




94601 


ggcaccacgc 


tcggtgaccc 


gatcgaggcc 


caggccctcc 


tggccaccta 




94651 


cggccgcgac 


cgcgaccccg 


agaacccgct 


gctgctcggt 


tcgatcaagt 




94701 


ccaacctcgg 


ccacacccag 


gcagccgccg 


gtgtcgccgg 


cgtcatcaag 




94751 


atggtcatgg 


cgatgcggca 


cggcgtgctg 


ccccgcagcc 


tgaacatcac 




94801 


cgagccgtcc 


tcgcacgtcg 


attggagcgc 


cggcgccgtc 


gaactgctca 




94851 


ccgagcagac 


cgcctggccg 


gagaccggcc 


gggcccgtcg 


cgccggtatc 




94901 


tcctccttcg gcatcagcgg 


caccaacgcc 


cacgtcatcc 


tggagcagcc 




94951 


ggaggccgcg 


cggcactcgg 


cgccggaaga 


agccgacacg 


gcggaggcag 




95001 


ccgccaaggc 


gccggccacc 


gcgcacctgc 


ccgtaatgcc 


gtgggcactg 




95051 


tccggcaaga 


cgccggaggc 


cctgcgtgcc 


caggccgcac 


gcctcctcgc 




95101 


ccacctccag 


cagcgccccg 


aactcgcacc 


cgccgacatc 


gccctgtccc 




95151 


tcgccaccca 


gcgctcccag 


ttcacccacc 


gggcagtcgt 


cctgagcacc 




95201 


gaccgtgacg 


aggcgacccg 


cgcgctgtcc 


gccctcgcca 


ccaccgccgc 




95251 


gtccgacccc 


tcggccctca 


ccggcacggt 


caccatggga 


cgttgcgcgg 




95301 


tgctgttctc 


gggtcagggc 


tcgcaacgtc 


tgggcatggg 


gcgtgagttg 




95351 


tacgagcgtt 


tcccggtctt 


cgccgaggct 


ctcgatgtcg 


tgatcgatca 




95401 


cctggacgcc 


gccttgcccg 


cccaggccgg 


tttgcgtgag 


gtgatgtggg 




95451 


gcgacgatgt 


cgagttgctg 


aacgagacgg 


gttggaccca 


gcccgcgctc 




95501 


ttcgccatcg 


aggtggcgct 


gtttcggctg 


gtggagagtt 


ggggtgtccg 




95551 


tccggacttc 


gtggccggtc 


attccatcgg 


tgagatcgcg 


gcggcgcatg 




95601 


tcgtcggggt 


gttctcgttg gaggacgcgt 


gccgtctggt 


ggccgcgcgg 




95651 


gcgacgctga 


tgcaggcgtt 


gccggccggt 


ggcgcgatga 


tcgcggtcca 




95701 


ggcgaccgag 


gacgaagtca 


tcccgcacct 


gaccgacgag 


gtggcgatcg 




95751 


cggccgtcaa


cggcccgacc 


tccgtggtga 


tctcgggcgc 


agaagaggcc 




95801 


acgcagaccg 


tggcacaaca 


cttcgccgac 


caggggcgcc 


ggacgaccgc 




95851 


gctgcgggtc 


tcgcatgcgt 


tccactcgcc 


gctgatgatg 


ctggcggagt 




95901 


tccgtgcggt 


ggccgagggc 


ctgtcctacg 


ccaccccgac 


cctccccgtc 




95951 


gtctcgaatc 


tgacgggcca 


ggtggccacg 


gccgacgaac 


tctgctcggc 




96001 


cgagtactgg gtgcgccacg 


tccgtgaggc 


ggtccgcttc 


gccgacggtg 




96051 


tgacggccct 


cgaagccgag 


ggcgtgcgga 


ccttcctgga 


actcggcccg 




96101 


gacggcgtcc 


tcgccgccat 


ggccagggaa 


accgtcgccg 


acgacacggt 




96151 


caccgtcccc 


gtcctccgca 


ggaacatgcc 


cgaggaacgg 


accctgctca 




96201 


ccgcactcgg 


ccggctccac 


accaccggaa 


ccccgatcga 


ctgggccgcc 
96251 ctcctggccc cgaccggcgc

96301 ccaacaccgt cccttctggc 

96351 ccgccgtcgg catcgccggc 

96401 gaactcgccg acgaagaggg 

96451 gtcgcatccg tggctggccg

96501 ccggcaccgc actgctggaa 

96551 tgtgaccacg tcgaggaact 

96601 gcgcggcgcg gtacagaccc 

96651 ggcgccgcac cgtcacgatc 

96701 gacagtgaca cccacaccgg

96751 cggcgtcctc gtcgccggcc 

96801 ccaccgtgtg gccgccggcg 

96851 tacgcgtccc gggccggcga 

96901 cctgcgagcg gcctggcgcc 

96951 tgccggaggc cggccgtacc

97001 ctgctcgacg ccggactgca 

97051 gcccacacgg acgggcagcg 

97101 ccgcttccgg tgcctcctcg 

97151 ggaacgctga gcctggccat 

97201 cgtacaggcc ctctccatgc

97251 ccgcgggcct cgcccgcgac 

97301 ccggagccgg cgtgccagcc 

97351 cgcggtcgtc ggtacggaaa 

97401 ccctgcgtgc ggcgggcgcc 

97451 gatgaacccg cgcccgcgct

97501 gaccggcacc gcagaagcgg 

97551 cccgccgagc cctggccctc 

97601 gcggacacga agttcgtctt 

97651 tgtggctgct gctgcggtgt 

97701 atccgggttg ttttgctctg

97751 gcggctgcgc tcgtcgctgc 

97801 gcgcggtgat gtgttgcggg 

97851 aggtcggtgc gggtgctgat 

97901 ggtgtgtcgt tctcgggtga

97951 tggtctgggt gcggtgttgg

98001 gggatctgct gttggtcagt 

98051 gagttggtgg cggagcttgc 
ccgcccggtg gacctgccga catacgcgtt
cctccggccc ccgcgacacc gcggatgccg 
gcgagccacc cgctcctcaa cggcatcgtc 
cctgttgttc accggacggc tgtcactgca 
accacgccgt catgggacag gtcctgctgc 
ctcgccctgc gcgccggcga cgaggtcggc 
gacgctcgcc gcaccgctcg tcctgcccga 
aggtccgggt cggcgtcgcc gacaccaccg 
cactcgcgtc ccgcacgcgc cacgaccacc 
caccgacacc ccgtggaccc aacacgccac 
tgccggcgac ggcaaccgtc ccgttcgatg 
cacgccgaac ccgttgacct ggcggacttc 
aggattcggc tacgggcccg ctttccaggg 
gcgacggcga ggtgttcgcc gatgtcgcac 
gaagccgagg cgtacgggct gcatccggca 
cgcagcctgg ctcgtcgccc cggacgggga 
tgccgttctc gtggcgcggc gtcttcctgg 
gtccgcgtcc gactcggccg cgactccgac 
cgccgacacc accggtgcac cggtcgcgtc 
gcaccgtctc ggtgacggcc ctgagcgcca 
gcgctgttcc gcctggactg ggcctcggcc 
ggacgacacg gtgaccgtga tcccggcggt 
cctccgaact cacctccgag ctcaccgcgg 
gacgtcgacg tccgcacgac cctgtcgacc 
gatcgcgctg ccgctcgtcg cctccgacca 
cacccgtccc ggcggccgtg cacgacctca 
gtacagaccc gcctgcaaga gcagcacttc 
cgtgacccgt ggtgcgacgg tcgggcgtga 
ggggtctggt gcgttcggcg cagtcggaga 
gtcgatctgg atccggatgg tgcggtgggt 
gttggtcagt ggtgagccgc agcttgcggt 
tcgcgcgtct ggtgcggcgg ccgctcaccg 
ggcaccgggg atggcgtcgg gggtggctct 
gggtgcggtc ctggtcactg gtggtacggg 
cgcgtcatct ggtggccgag tatggggtgc 
cgcagtggtg aacgtgccgt gggtgctggg 
gggtgtgggt gcgcgggtgc gggtggttgc 
98101 gtgtgatgtg accgatcgtg

98151 cggtgtccgc ggtggttcat 

98201 ggtgcgttga ccggggagcg 

98251 tgctgtctgg catctgcatg 

98301 tcgtcgtctt ctcctctctc

98351 aactacgcgg ccgcgaacgc 

98401 ggcggaggga ctgcccggcc 

98451 ccgatggcac atcgggcatg 

98501 cgttcgggag tgccaccgct 

98551 cgcggccctg gcgaccggtg

98601 tgtcggcact gcgtgcccag 

98651 atccgaggcc gctcgcgccg 

98701 cgggctgcgg gaacggctcg 

98751 tcctcctgga cctcgtgcgc 

98801 gacgccgacg acgttcatcc

98851 ctccctcacc tcggtggaac 

98901 tgcggctccc cgcgaccatg 

98951 gtctcctacg tcctggacga 

99001 cgtgcagccg gccgcggttg 

99051 gcatggcctg ccgctacccc

99101 aggctcgtca cggacggcgt 

99151 tggttgggac gtcgaatccc 

99201 cctcctacac gcgctcgggt 

99251 ccggggttct tcgggatgag 

99301 gcagcggttg ttgttggagt

99351 ttgatccggt gagtttgcgg 

99401 atgtacagcg attacagcgc 

99451 ccagggcagt gggagttcgc 

99501 cgttggggtt ggaaggcccg 

99551 tcgttggtgg cgatgcactg

99601 tgggttggcg ttggccggtg 

99651 ttgtggactt tgctcggcag 

99701 gcgtttgcgg atgcggccga 

99751 gttggtcctg gagcggcagt 

99801 tggctgtggt gcggggttcg

99851 ttgacggcgc ccaatggtcc 

99901 ggccagtggt ggtctgacgg 



ccgcggtggt ggagttggtt ggcgggcatg 
gcggctggtg tgctggatga cggcatggtg 
gttgtccgcg gtgctgcggc cgaaggtgga 
aggcgacccg cggcctggat ctggacgcgt 
gccggggtct tcggcagtcc cggccaggcc 
cttcctggac gcgctgatga cgcggcgccg 
tgtcactcgc atggggaccg tggtcgctga 
ctcgcggacg ccgaggccga tcgcctgacc 
gaccgcggag caaggactgg cactgttcga 
acgccacctg cgtcccggtc cgcctggacc 
ggtgaggtgc cgcccttgct gcggtccctg 
cgccgccgcc gcggaatcgg caaccgccac 
tcggactgaa cccggtcgag cgacaggaag 
ggccaggtgg ccctggtcct cggccacgcc 
ggcacgtgcc ttcagggagt tgggcttcga 
tgcgcaaccg cctcaacacc gtcaccggtc 
gtgttcgact atccgaccgt cgaagtgctg 
gttgctgggc acggatgccg aggtggcgac 
cggtggcgga cgatccgatc gtcatcgtgg 
ggcggggtcg cctctccgga cgacctgtgg 
cgacgcggtg tcccccttcc cgaccaaccg 
tctatcaccc ggaccccgac catctcggta 
gggttcctgc atgaggcggg ggagttcgat 
tccgcgggag gcgttggcga ccgattccca 
cgtcgtggga ggcgatcgag cgggccggta 
ggtagtcgga cgggtgtgtt cgcgggggtg 
gatgttggcg agtccggagt tcgagggttt 
cgagtttggc gtcgggtcgg gtggcctaca 
gcggtgacgg tggatacggc gtgttcgtcg 
ggcgatgcag gcgttgcgta gtggtgagtg 
gtgtgacggt gatgtcgacg cctgcggtgt 
cggggtttgt cgccggatgg tcggtgcaag 
tggtgtgggc tggtccgagg gcgtcggcgt 
cggacgcggt gcgcaatggt cacgagattt 
gcggtcaacc aggatggtgc gtccaatggt 
gtcgcagcag cgggtgatcc ggcaggcgtt 
ccggtgacgt ggatgtggtg gaggcgcatg 
99951 gtacgggtac gacgctcggt
100001 acgtatgggc gggatcgtga 
100051 gaagtcgaat ctggggcata 
100101 tcaagatggt gttggcgatg 
100151 gtggatgcgc cttcttcgca
100201 gctcagtgag caggcggctt 
100251 gtgtctcctc cttcggcatc 
100301 cgtccggagg ccgcgcggcg 
100351 gtccacggtg ccgtgggtgc 
100401 cccaggcggc gaaactgttg
100451 ctcgtcgacg tgggaatgtc 
100501 ccgtgcggtg gtactcgccg 
100551 cggcgattgc tgccgacgag 
100601 ggcgcgggtc gtcacgcggt 
100651 gggcatgggg cgtgagttgt
100701 tcgatgtcgt ggtcgaccac 
100751 ttgcgtgagg tgatgtgggg 
100801 ttggacgcag ccggctctgt 
100851 tggagagttg gggtgtccgt 
100901 gagatcgcgg cggcgcatgt
100951 ccgcctggtg gccgcccgtg 
101001 gtgcgatgat cgcggtccag 
101051 accgacgacg tagcgatcgc 
101101 atcgggtgtg gaggatgccg 
101151 aggggcgtcg cacgacccga
101201 ttgatggatc cgatgctggc 
101251 ctatgctgct ccgtccctcc 
101301 ccacggccga cgaactgtgc 
101351 gaggcggtcc gcttcgccga 
101401 gcggaccttc ctggaactcg
101451 gagcctcgct caccgaatcc 
101501 cggccggagg aaccggcggc 
101551 cggcgcgcgc gtcgattggc 
101601 gggtggagtt gccgacgtat 
101651 ggtcgggttg gtgttggtgg
101701 ggggcatccg ttgttgggtg 
101751 tggtgttgac gggtcgtctg 



gatccgatcg aggcgcaggc gttgttggcg 
gcctgagcgg ccgttgttgt tgggttcggt 
cgcaggctgc tgcgggtgtg gcgggcgtta 
cggcatggtg tggtgccgcg gacgttgcat 
tgtggactgg tccgagggtg cggtggagct 
ggccggagac gggtcgggtg cggcgggcgg 
agcggcacca acgcgcacgt gattctggag 
tccggtgatg gagacgaaca ccgtggagcc 
tttccggcaa gacaccggag gccctgcgcg 
tcgtcgatcg aggaacgccc ggagcttcgc 
cctggtcacc ggccgctcga cctttgaaca 
ccgaccgtgc cgacgccgcc cgcgcattgt 
gccgatgctg ccgctgccac cggccgcgtc 
gctgttctcg ggtcagggtg ctcaacgtct 
acgagcgttt cccggtcttc gccgaggctc 
ctggacgccg ccttgcccgc ccaggccggt 
cgacgatgcg gagttgctga acgagacggg 
tcgccatcga ggtggcgctg ttccggctgg 
ccggacttcg tggccggtca ttccatcggt 
cgccggggtg ttctcgctgg aggacgcctg 
cgacgctgat gcaggcgttg ccggccggcg 
gcgaccgagg acgaagtcac cccgcatctg 
cgccatcaac gggccgaacg cactggtcgt 
ccgtcgagat cggggcgcgg ttcgcggccg 
ctccatgtgt cgcatgcgtt ccactcgccg 
ggagttccgt gtggtggcgg agggcctgtc 
ccgtcgtctc gaatctgacg ggccaggtgg 
tcggccgagt actgggtgcg ccacgtccgc 
cggggtgacg gccctcgaag ccgagggcgt 
gcccggacgg cgtcctcgcc gccatggcag 
tccctcgcgg taccgctgct ccgtaaggac 
actcgccgcc ctggcccagt tgcacatcgc 
ccgtgctctt cgctggtgtg ggtgcggggc 
gcgttccagc gtgggtggtt ctggccggtt 
tgatgtgggt gctgtggggc ttgggtctgc 
ctgcggtgga gttggctgcg ggtgcggggg 
tcgttgtcgt cgcatggttg gttggctgat 
101801 catgcggtga tggggcgggt

101851 ggtgatgcgt gctggtgatg 

101901 cgttggcggc gccgttggtg 

101951 gttgctgtgg atgctcctga 

102001 ttcgtgtcct gatggtgtgg

102051 gtgtgttggc ctctggtgtg 

102101 ggtgtgtggc cgccgcaggg 

102151 cgagctgttt gcggatgctg 

102201 tgcgtgcggt gtggcgtcgc 

102251 tcggacgagg ttgctgagag

102301 cccggcgttg ctggatgcct 

102351 aaggtcaatc cgccgatggt 

102401 gtttccctct tcgcctcggg 

102451 ggcgggtgag catgcggtgt 

102501 cggtgatttc catcgacgcg

102551 gtcaacgcat ctcacaccca 

102601 gaccacggtc ccgagcaccc 

102651 tcggaaccga ccacctgggg 

102701 ggcgcgacca cgaccacgac 

102751 gatcgcggcg gggcccgaag

102801 tcaccaccga ggacgccatc 

102851 gtggccggcc aaggcaccat 

102901 tcgccgcctc accgccgagg 

102951 acgagcgtct ggcggcaagg 

103001 gacgggcagg atgtcgccgc

103051 ccagaccgag aacccgggca 

103101 aggcgtcgac cgcggtcctc 

103151 ctgcttctgc gcgacggaca 

103201 gtcgcccgcc gacacggccg 

103251 tgctgatcac cggtggtacc

103301 ctcgtcgaca ggtacggcgt 

103351 ccccgatgcc ccgggaacca 

103401 gtgccgaggt ggccgtgcag 

103451 gcggcgttgg tcgccggcgt 

103501 ccacacggcc ggtgtgttgg

103551 agcggctcgc caccgtcctg 

103601 cacgaagcga cccgcggcct 



gtttgttcct ggtacggcgt tgctggagat 
aggtggggtg tggtcgtgtt gaggagctga 
ttgcctgagc gtggtggggt gcgggttcag 
tgctgcgggt cgtcgtggtg tgggggtgta 
gtcaggcggt gtggtcgcag catgctgtcg 
gctgaccagg tcggtgggtt cggtgacggt 
tgcggtgtcg gtggatgctg agggctgcta 
ggttcggtta tggcccggtg ttccaggggt 
ggcgaggaac tcttcgcaga ggtcgccctg 
cgctgatacg gcgaccggtt tcgggttgca 
cgctccatgc ctcgcttctc tcctcccttg 
gggcctgcgt tgccgttcgc gtgggagggt 
tgcgacggct ttgcgcgtgc ggttggcgcc 
cggtgaccgc ggtggatccg accggtgcgc 
cttcgtaccc gtcgcctcac cctcgatgag 
gctgagcgat gcgctcttcg gcgtccaatg 
cggccgccga ccacccgtcg gtagccatca 
ctcgccgaag cgctcagcag ttcctctgct 
cgcggccgcg tacgagagcc ttgacgcgct 
tgtccgtccc tgacgtcaca ctcatcggtc 
gctcagtacg tgaacgacca cgacgccacg 
cggagccggc gcggcggccg tagacgcggc 
ccctgcgcac gatccaggca tggttggccg 
cgcctggtct tcgtgacccg cggcgcggcc 
ggcggccgtc cagggcctgg tgcgctccgc 
ccttcggcct cctcgacctc gacgggaccg 
ggcgaggctc tcacctccga cgaaccgcaa 
cctgcacgcc gccaggctga cccgcctggc 
tgcccacgga gtggaacgcg gacggcacgg 
ggcgggctcg gcgcgcagtt cgcacggcac 
ccgcaatctc ctgctcgtca gccggcgcgg 
cggagttggt cgccgagctg acggcgcacg 
gcatgtgacg tggccgatgg cgatgcggtg 
gccggatgag cacccgctga gggcggtcgt 
acgacggagt gatcggctcg ctcaccgagg 
cggcccaagg cggatgccgc ctggcatctg 
ggacctggac gcgttcgtcg tcttctcctc 
103651 cgtcgcaggg gtcttcggcg

103701 acgccttcct ggatgcgctg 

103751 ggactctcgt tggcctgggg 

103801 catgctgtcg gacgccgagg 

103851 cgctctccgc ggagcagggc

103901 gccggaacca gcacgccaga 

103951 gtcgggaacc ggcgacacga 

104001 cggtccggct cgacctggcg 

104051 attctgcgcg gactggtgcg 

104101 ttccgtcacc gtggccggac

104151 acgagcggcg ccaggaactt 

104201 gtcctcgggc acgccgatcc 

104251 tgacctcggc ttcgactcgc 

104301 gcacggccac cggcctgcgc 

104351 aacaccgatg ccctcgcgga

104401 cgagagcgag gtgcgggtgc 

104451 atcccatcgt ggtggtgggc 

104501 tcgcccgagg acctgtggcg 

104551 caccttcccg accaaccgcg

104601 accccgcaca cctcggcacc

104651 gaggcggggg agttcgaccc 

104701 gctggcaacc gactcccaac 

104751 cgatcgagcg ggccggcatc 

104801 ggcgtcttcg ccggcgtgat 

104851 caaggagttc gagggcttcc

104901 cgggccgcgt ctcctacacc 

104951 gataccgcct gctcctcctc 

105001 ccttcgggcg ggtgagtgca

105051 tgtccacgcc aggcacgttc 

105101 cctgatggtc gttccaaggc

105151 gtccgagggc gtcggcatcc 

105201 gcaacggcca cgagatcctc 

105251 gacggtgcgt ccaacggcct 

105301 cgtcatccgt caggcgttgg 

105351 acgccgtcga ggcgcacggc

105401 gcccaggcgc tcctggccac 

105451 gctgctgctc ggttcgatca 



gcgccggcca ggccaactac gccgcggcca 
atggcccagc gccgggcagc gggcctgccc 
gccgtgggac cagaccggcg gaatgacggg 
ccgaccgcct cgcccgctcc ggcatcccgc 
ctcgccctct tcgacgcggc actcgctctt 
cagggcagcc ggcagcgccg ccgccagcac 
tcgccatccc ggccgcggcc ctcgtcgcac 
gcgctggccg cgcagggtga ggtccccgcg 
gacccgcacc cgccgtacgg cggccggcgg 
tcgtcaaccg cctgtccggg ctcaccgccg 
ctcgaactgg tccgcactca ggcagccctc 
ggcgtccgtg gactccaccg cacagttccg 
tgaccgccgt cgagctgcgc aaccggctga 
ctgaccgcaa ccctggtctt cgactacccg 
gcacctgcgg gacgagttgt tcggcgcggt 
cggtccaggc actgccgccg accgccgacg 
atggcctgcc gtttccccgg tggtgtgacc 
cctggtcgac gccggcaccg acgccatcac 
gctgggacct cgaatcgctc tacgacccgg 
tcctacaccc gctccggtgg cttcctgcac 
ggcgttcttc ggaatgagcc cgcgtgaggc 
agcgtctcct gctggaatcc tcctgggagg 
gacccgctga ccctgcgcgg cagcgccacc 
gtacagcgac tacgggagca tcctcggcgg 
aaggccaggg aagtgcgggc agcgtggcct 
ctcggcttcg aaggtcccgc cgtcaccgtg 
cctggtcgcc ctgcacctgg cagcccaggc 
cgctcgcgct cgccggtggt gtgacggtga 
gtggagttct cgcggcagcg gggtctggcg 
gttcgccgag gccgcggacg gcgtcggctg 
tcgtcctgga gcgccagtcg gacgccgtgc 
gccgtgatcc gcggctcggc ggtcaaccag 
gaccgcgccc aacggcccgt cccagcagcg 
ccagtggcgg cctgtccacg gccgacgtgg 
acgggtacga cgctcggtga cccgatcgag 
ctacggccgc gaccgcgacc ccgagaaccc 
agtccaacct cggccacacc caggcagcgg 
105501 ccggtgtcgc cggtgtcatc aagatggtca tggcgatgcg gcacggcgtg
105551 ctgccgcaga ccctgcatgt cgacgcgccg tcctcgcacg tcgattggag 
105601 cgtcggcgcc gtcgaactgc tcaccgagca gaccgtctgg ccggagaccg 
105651 gccgggtccg tcgcgccggt gtctcctcct tcggcatcag cggcaccaac 
105701 gcccacgtca tcctggaaca gccggaggcc gtgcagcgcc tggcaccggg
105751 agcagcagag accgtggagc cggtcgccat caagccgtcg gcggaaccgt 
105801 ccctggtgcc gtgggcgctg tccggcaagt cacccgaggc cctgcgcgcc 
105851 caggccgcac gcctccgtga cttcctggcg gaacggcccg aaccgcgctc 
105901 gatcgacatc ggccactcac tggccgtcac acgctcgcag ttcgaccacc 
105951 gcgcgatcgt gctggtcgac gatgcgaagg ccccggcgga cagcctggcc
106001 gccctcgcgg ccctggcctc cggtgtggcc gatcccgccg tcgtctccga 
106051 cgcggtatcg accggcggtt cggcagtgct gttcacaggt cagggtgctc 
106101 aacgtctggg catggggcgt gagttgtacg gccgtttccc ggtcttcgcc 
106151 gaggctctcg atgtcgtggt cgaccacctg gacgccgcct tgcccgccca 
106201 ggccggtttg cgtgaggtga tgtggggcga cgatgtcgag ttgctgaacg
106251 agacgggttg gacccagccc gcgctcttcg ccgtcgaggt ggcgctgttc 
106301 cggctggtgg agcgttgggg tgtccgtccg gacttcgtgg ccggtcattc 
106351 catcggtgag atcgcggcgg cgcatgtcgc cggggtgttc tcgctggagg 
106401 acgcctgccg tctggtggcc gcccgtgcga cgctgatgca ggcgctgccg
106451 accggcggcg cgatgatcgc ggtccaggcc accgaggacg aagtgacccc

106501 gcacctgacc gacgaggtgg cgatcgcggc cgtcaacggc ccgacctccg 
106551 tggtgatctc gggcgcagaa gaggccacgc agaccgtggc acaacacttc 
106601 gccgaccagg ggcgccggac gaccgcgctg cgggtctcgc atgcgttcca 
106651 ctcgccgctg atggatccga tgctggcgga gttccgtgcg gtggcggaag 
106701 gactgtccta cgccaccccg tccctccccg tcgtctcgaa tctgacgggc
106751 tggctggcca cggccgacga actgtgctcg gccgagtact gggtgcgcca 
106801 cgtccgcgag gcggtccgct tcgccgacgg catcaccacc ctcgaagccg 
106851 agggcgtgcg gaccttcctg gaactcggcc cggacggcat cctgtccgcg 
106901 ctggctcagc agtccctcgc cggcgaagcc gtcaccgtgc ccgtcctgcg 
106951 caaggaccgc ggtgaggagt ccacggccct gacggcccga gcgcatctcc

107001 acacccgcgg actgatcgaa gactggcagg acttcttcgc tggtgtgggt 
107051 gcggggcggg tggagttgcc gacgtatgcg ttccagcgtg ggtggttctg 
107101 gccggttggt cgggttggtg ttggtggtga tgtgggtgct gtggggcttg 
107151 ggtctgcggg gcatccgttg ttgggtgctg cggtggagtt ggctgcgggt 
107201 gcgggggtgg tgttgacggg tcgtctgtcg ttgtcgtcgc atggttggtt
107251 ggctgatcat gcggtgatgg ggcgggtgtt tgctcctggt acggcgttgc 
107301 tggagatggt gatgcgtgct ggtgatgagg tggggtgtgg tcgtgttgag 
107351


gaactgacgt 


tggcggcgcc 




107401 


ggttcaggtt 


gctgtggatg 




107451 


gggtgtattc 


gtgtcctgat 




107501 


gctgtcggtg 


tgttggcctc 




107551 


tgacggtggt 


gtgtggccgc 




107601 


gctgctacga 


gctgtttgcg 




107651 


caggggttgc 


gtgcggtgtg 




107701 


cgccctgtcg 


gacgaggttg 




107751 


ggttgcaccc 


ggcgttgctg 




107801 


tcccttgaag 


gtcaatccgc 




107851 


ggagggtgtt 


tccctcttcg 




107901 


tggcgccggc 


gggtgagcat 




107951 


ggtgcgccgg 


tgatttccat 




108001 


cgatgaggtc 


aacgcatccc 




108051 


tccaatggac 


cacggtcccg 




108101 


gccatcatcg 


gaaccgaccc 




108151 


cttgcccctg 


gtcgaggagc 




108201 


agcacccggt 


accggacctc 




108251 


acaggcgtac 


ctgcggacgc 




108301 


catgctccga 


tccgtgcgtg 




108351 


agcagtggtt 


ggcggacgac 




108401 


acgcgcgggg 


cggtttccgt 




108451 


ctcggccgtc 


tggggtctgg 




108501 


gcttcggtct 


tctcgacctc 




108551 


gcccccgagg 


tcgacatcga 




108601 


gaccgtgcag 


cccgcgctcg 




108651 


cgcagttggc 


actgcgcggc 




108701 


atccccgcgc 


cgcagaccga 




108751 


ccgtccggag 


atcgacaccc 




108801 


gtaccggtgg 


cctcggtggg 




108851 


ggggtacgga 


gcctggtgct 




108901 


agcggagaag 


ctggtcgccg 




108951 


tgcagacgtg 


tgatgtggcc 




109001 


ggcgtgtcgg 


acgagtaccc 




109051 


gttggacgac 


ggagtgatcg 




109101 


tcctgcggcc 


caaggcggat 




109151 


gatctggacc 


tggacgcgtt 



gttggtgttg cctgagcgtg gtggggtgcg 
ctcctgatgc tgcgggtcgt cgtggtgtgg 
ggtgtgggtc aggcggtgtg gtcgcagcat 
tggtgcggct gaccaggtcg gtgggttcgg 
cgcagggtgc ggtgtcggtg gatgctgagg 
gatgctgggt tcggttatgg cccggtgttc 
gcgtcgcggc gaggaactct tcgcagaggt 
ctgagagcgc tgatacggcg accggtttcg 
gatgcctcgc tccatgcctc gcttctctcc 
cgatggtggg cctgcgttgc cgttcgcgtg 
cctcgggtgc gacggctttg cgcgtgcggt 
gcggtgtcgg tgaccgcggt ggatccgacc 
cgacgcgctt cgtacccgtc gcctcaccct 
acacccagct gagcgatgcg ctcttcggcg 
agcaccccgg ccgccgacca cccgtcggta 
cttcggcctc gcagacggcc tttcggacgc 
gcggtgacct cgcggcgctc gcagcgtcgg 
gtcctcgtcc cggtagcggg cacccggcgc 
cgaaggacac accgacgccg ggacatccga 
aggccaccgc acaggtactg gagcagatcc 
cggttcgagg cggcgcggct ggtgttcgtg 
gggtgagggc ggcatcgccg acctggcggc 
tgcggtcggc gcagtcggag aatccgggct 
gacctcgacc tcgcccttga ctccgacctt 
gcgcgaccgt gaccgcgatc cggtcggtgg 
ccgcggccct gcacgcgacc gccgacgagc 
gggaccgtgc aggccgcccg actgacccga 
ccgtgccgag accgaccctg ccgagaccga 
ggcggcccgg cacggtgctc atcaccggtg 
ttgctcgccc ggcacctcgt cgccgagcgg 
cgccagccgg agcggtctcg cggccgaggg 
acctcgaagc gctcggtgcc gtggtggccg 
gatggcgatg cggtggcggc gttggtcgcc 
gctgacggcg gtcgtccaca cggccggtgt 
gctcgctcac cgaggagcgg ctcgccaccg 
gccgcctggc atctgcacga ggcgacccgc 
cgtcgtcttc tcctccctcg ccggcgtcct 
109201 cggtggcgcc ggtcaggcca actacgcggc ggcgaacacg ttcctggacg

109251 ccttgatggc gcagcgtcgc gccgccgggt tgccgggtgt gtcgctggcg 

109301 tggggtccgt gggaccgggc cggcggcatg acggggaccc tgtcggacgc

109351 cgaggccgac cgcctcgccc gctccggtgt tccgccgatc tcggcggagc 

109401 agggccttgc gctgtacgac gcggcgaccg ccggtgagcg gccgctggtg

109451 gtgccggtgc ggctggacct cgccgcgctc cgcgggctcg gtgatgtccc 

109501 ggcgctgctg cgcggactgg tccggacgcc cgcgcggcgg accgcggcgg 

109551 ccggtgcggc gccgtcggcc gatgtgctca cccggcagtt ggccgggctc 

109601 ggcggggcgg agcaggagga ggtcctgctg aggctggtgc gcggtcaggc 

109651 cgcggtggtg ctcgggcacg ccgacggctc ggcgatcggt gcggggcgac

109701 agttccagga gttgggcttc gactcgctga ccgcggtgga gttccgcaac 

109751 cgactcaacg cggccaccgg actgcggctg ccggccaccc tgctgttcga 

109801 ctacccgacg ccggccgacg tcgtcgggca cctgcgcggc cggctcggca 

109851 ccggggaggt gtcgggtgcg ggctcggtgc tggcggcgct ggacaacctt 

109901 gaggcggtga tcgccgggct gtccctcgac gacgcggggg agcaccagtt

109951 ggtggccggc cggctggagg tcctcagggc gaagtgggcg gacatgcgaa 

110001 gcgcggaggg agctgtggac ggcggtgcgg acgtcgacat cgaggaggcg 

110051 tcggacgacg acatgttcgc gctgctggac gacgagctgg ggctgaactg 

110101 agccgctccg catgagcagt tccgcaccag aagttccaca gtggcctggt 

110151 ccacagtgac ctggtccaca gtgagctgtt ccacagcgac ctgttccgcg

110201 gcggttccgt agtgagccgt tccgcttagg cgggaaaccg gggccccggt 

110251 ggtcggacgc ttgattccgg ccaccgggag cgtcaaaccg gcctcttcta 

110301 aagaaaggga aagaactagg gaaacaccca cagcccgatc atgtgcaatg 

110351 aattccgtgg ctcggaaaac cattcgcggg ccatggagtt acggtgtgat 

110401 cacgtgtccg tacggcagtg acggcgtgca cgggttgcca cggcaggtcc

110451 ccggccgagg cgtctccccg ctgctgcccg gcggtgcgtg ccgcttggtt 

110501 cgtccctagg aggtcccccc atgaccacgt ccaccgagga gagcctgtgg 

110551 gcccggtgct tccatccagc accggccgcc cccgtccggc tcttctgttt 

110601 cccacatgcg ggcggctcgg cctccttcta cttcccggtg tcggcccaac 

110651 tgtcctcggt tgccgaggtg ttcgccatcc agtacccggg gcgccaggac

110701 cggcgcaagg aagccggtgt cagtgacctc gcgaccttgg ccgaccaggt 

110751 ctacgacgcg ctgcgccccc tgctgaagga gcggccgagc acgttcttcg 

110801 ggcacagcat gggcgcgacg ctggccttcg aggtggcccg gcgcttcgag 

110851 gccgacgacg gtgacctggt ccggctgttc gcctccgggc gccgggcccc 

110901 ctcccgcgtg cgtgaagagg ccgtgcaccg gcggtccgac gacggcatcg

110951 tcgaggagct gaagctgctc gccggcacca acaccgcgct gctcggcgac 

111001 gaggagatcc tgcggatgat cctgcccgcg atccgcagcg actaccaggc 
111051 catcgagacc taccgctgcc

111101 ccgtcctcac cggcgaccgc 

111151 gcgtggcgcg gccacaccac 

111201 tgggcacttc ttcgtcagct 

111251 gggcgcacct cgccggcaac

111301 gtgccggtct gccgcacggc 

111351 cccggcgacg cgcgttgcgt 

111401 cgcgggcgcg cggaggacgt 

111451 cattcggaaa acgtcaaatg 

111501 cgcactatat gatcttgtgg

111551 gctcctggtg catccatccc 

111601 cgtggtcctg gtacgcgtcg 

111651 gtggaggaac gcgtgatgcg 

111701 cacgctggtg ggacgcgacg 

111751 cggccgcccg cgacgggcgg

111801 ggcatgggca agaccagtct 

111851 ccgcggcatg acggtgctgt 

111901 ccgggtacgg cggtgtgcgc 

111951 ggcgacgccc ggcgctcgcc 

112001 gcccgcactc accgcagacc

112051 acccggtgct gcacggcctg 

112101 cggccgctgg tcctcgtcct 

112151 cctggcctgg atcgacttcc 

112201 tggtcgtcct ggcctggcgc 

112251 ctcgccgaca tcgccgccca

112301 gctcggcccc gacgacatag 

112351 cggccgcacc gtcgttcgtc 

112401 ccgctggccc tcgcccgcct 

112451 gccggacgcc gccggggagc 

112501 tcgcccgctc ggtgcgctgc

112551 ggcgtggccc gtgccatcgc 

112601 ggcggcgctc gccggcgtcc 

112651 tgctgcgcag ggccggcatc 

112701 gacgtcgtcc gctccgcggt 

112751 cgaactgcgc accaacgccg

112801 ccgaggagct cgccggccag 

112851 tggatggccg cggtgctgcg 



cgcccgacgt caccgtccgg gcgccgctga 
gacccgaaga cctccctgga cgaggccgag 
cggggacttc gacctcaagg tgcttcccgg 
ccgaggcccc ggcgatcatc gatctgctcc 
ggctagcggg cgcactgcgg caggccggcg 
ccccgcgccg cctgagacgg caccatgcca 
gcgcctgtgg agcaccgtcc cgcgtgcgta 
gctcggccgg ccggatgtcc tggaggagac 
tctgacagct gctgctcatt tactccagcc 
tggggcatgt acggagcgtg ccactcgttc 
gaccgtccgg atttgtgtgc accggacggc 
gcgcgaggtc ggctcctgtt ctcaccccac 
gaagcagtcg ggttcttccg gcttactgac 
acgaactgcg caccctcgcc cggcacgccg 
gccggcctgg tcctgctcca cgggcccgcc 
gctgcggtcc ttcacggcga gcgatgtctg 
acggcacctg cggcgagacc gtcgccggcg 
gaactcctcg gcgggctcgg cctgagcggc 
cctcctggag ggcctggcgg cccgcgcgct 
ccgccggtcc cgacgccgcc acgggtgcct 
tactggctcg ccgcccgcct catggcccaa 
cgacgacgtc cactggtgcg acgaacgctc 
tgctgcgacg cgccgaggac ctgccgctgc 
agcgaggccg aaccggtcgc gcccgcggtg 
gcgccgcccc accgtgctcg gcctgcaccc 
gcgaaatggt gcgtcgcgtc ttccggacca 
agccgggtcg ccgccgtgtc cggcggcaat 
cctcgacgaa ctccgcgccg agggcgtccg 
gccgggccgc cgaggtcggc agtcacgtcc 
ctgttggagc gccggccgcc ctgggtgcgc 
cgtactcggc ccggagtgca ccgagttgct 
cggccgcgac cgtcgacgag gccctgttgg 
ctggccgccg accgcgtgga cttcgtccat 
gctcgacgac gtcgccccgc ccaccctggc 
cgctgttgct gagcgacgcc ggccgcccct 
ctcatgctgc tgccggtgct cgaccagccg 
cgacgccgcc gcccaggcgg agagccgcgg 
112901 cgccccggag gccggtgtgc

112951 cggacaacgt tgccgtccgc 

113001 aacccgcccg aggcgatgcg 

113051 ggacgtccgc acccgcgccc 

113101 tcgccgtgca ggaatccccg

113151 gccgagctga ccgccgaact 

113201 gttgcggacc ctcgtggagt 

113251 aggtgacgat cggtgcggtc 

113301 cccggcgaca caccggccca 

113351 gaccgcgatg gacggccggg

113401 gcgccctgcg cgcacccggc 

113451 gcctccttcg ccctctccct 

113501 actggacctc atgctccagt 

113551 acgtcctggc gctctccacg 

113601 ttccccgagg ccctggccga

113651 ggagcgctgg gcggacggcg 

113701 ccctcgtcga ccgcggcgag 

113751 atcacccgcc cccgcctcga 

113801 ccaggcccgc gcctacgccc 

113851 tggacctcct tctcgcctgc

113901 aacccggcgt tcgtgccgtg 

113951 cctggaccgc cacgaccagg 

114001 tggccgagcg ttgggggacg 

114051 cagggcgtcg ccgcacccgg 

114101 ggtctcgctg ctcgcggact

114151 aacttcttct cggacacgcc 

114201 cgggaacacc tgcgcgccgc 

114251 gaagctcggc gtcgacgcca 

114301 tacgcaggat gaccgcctcc 

114351 acggtggcgg acctcgcggt

114401 agccctcttc gtgactgtaa 

114451 accggaagct cggggtcggc 

114501 accaggaccg ccacctccgg 

114551 acgcggacgc gcttgagacg

114601 caaggcgcgg aaccaaccga

114651 cccacggcga ccccaccatg 

114701 atcggcgacc tcctgcaccg 



gctgcctcta ccgggtgttg gaggtggagc 
atccagatgg cccgcgcgct cgccgagatc 
cctcctcaag gaagcgctct ccctcgccgg 
aggtcgccgt ccagtacggc ttcacctgcc 
tccggggtgc ggatgctgga ggacgcgctc 
gggccccgaa ccagggcccg tggaccggga 
ccgtgctgct catcgtcggg gccgacgaga 
cgcgaccggg cggcccggct caccatgccg 
gcggcagatg ctggccatga ccaccgtgct 
acgcccggtc ggccgtcgac caggcccgcc 
gtcgagctgg aaccctggtc gctgctgtcc 
ggccgacgag gtcgccgacg cgcagtacgc 
acggccagga caacgcggcg gtgtggacgt 
cgcgccctgc tccaccacgg ggtgggcgcc 
cgcccagacc gccgtcgaga tcctcggcga 
ccgtgctgcc ccgtgtcgcg ctggccaccg 
cccgagcgcg ccgaacacgt cctcgacggc 
acgcttcgtc atcgaatacc actggtacct 
gctgggtccg cggggatttc caaggagccc 
ggtcggtccc tggaggagtc gcgcttcagc 
gtgggccgat ggcgcggtgc tcctggcgac 
cgcgcgaact cgccgcatac ggaagcgagt 
gcgcgcggcc tcggactggc cttcatggcc 
ccgcgccggc atcgatcacc tcaccgaggc 
ccccggcccg ggccatggag gcccgggccg 
cacctgaagc gcgacgacct gcgggccgcc 
cgccgacctc gcccagcgct gcggcgccgt 
gaaaactgct ggtcaccgcg ggtggtcggg 
ccactcgaca tgctgaccgg gatggaacgc 
gaccggcgcg agcaaccggg ccatcgcgga 
ggaccatcga aacccatctc acgagcgtct 
gggcgtgcgg agctgtccgc cgtcctggag 
tcggcagccg ccggcctggg tctcccaggc 
aacaggacga gaggtagcgg tgccccggag 
ccacctgcac gccgcagtgc gcacccgacg 
ctcctcgaat gcggccggga acagcggctc 
cctcggccag gggcggccat cggtgctcag 
114751 cctgaccggc cggcccgggc acgcccagaa cgccctggtc cgctggggcg

114801 cgtgccgggc caggcacgac gggctgcgcg tcctgcgagc ccaggcgacg 

114851 cccgcggaac gggaactccg ctacggcgcc gttctccaac tgctggccgt 

114901 cctcgacggc ccgcacggca gcaccctgga cgccgcgatc cgccacgacg 

114951 gtcccccgcc actgcccgtg cccggcatcg aggaggtgct gcggcgcacc

115001 ggcacggcac ccaccctggt cgtggtcgaa gacgtccagt ggttggaccc 

115051 ggcctcgctg acgtggttgc agatcctgct gcgccacctc gggccggaca 

115101 ccccgctcgc cgtcctggcc agcagctgcg gtgacaccac ggccttcgac 

115151 accgacccga aggcccccgc cgtcccgggg ccgccggaca ccgtgcccgt 

115201 cgcgcgcttc gtggtgcccg cgctcaccga ccgcggggtc gccgccaccg 

115251 tccgcgccgt ctgcggcacc cccggcgacg aggagttcat cgccgcgctc 

115301 acctccgcca ccgccggcaa ccccgccatc ctgcgggacg ccctgcgcgc 

115351 cttcgtcgac cacggcctcc ccgccgacgc cgaccacctc ccggagctgc 

115401 acgccctcac cgctggcgtc gtcggcgacc acaccgtgcg cgccctggac 

115451 ggcctgcccg ccgaagtcaa cgccgtcctg cgggccctgg ccgtctgcgg 

115501 cgacctgctc gacttccacc gagtccgggc cctcgccggc gcgcactcgc 

115551 tgtccgagga ccggatccgc accctgctgg cgagcgtcgg cctgaccgtg 

115601 tccgtcggcg acaaggtgca catccgcttc cccgcctcca aggcacgcgt 

115651 catcgaggac atgcccgccg cggagcgcgc cgatctgtac gtccgcgcgg 

115701 ccgaactcac ccacagttgc ggcgtcaacg acgaggacgt cgcccatctg 

115751 ctgctgcgct cgtcgccgct cggcgcaccc tgggtcgtgc ccctgctccg 

115801 ccgcggattc gccgccgcgc tgcgccggga ggaccaccac cgggcctgtg 

115851 cctgcctctc ccgcgccctg caggaacccc tcgacccccg ggaacgcagc 

115901 ctgctgacgt tggaactggc cgcggccgaa gccgtcgccc ggccggaggc 

115951 gggggatcga cgcctggggg aactcgtccg cagcaccgtc gcggacaccg 

116001 accccacgtc gtccggtgag ggggtggggg tccgcgccat cgacctgggg 

116051 ttcgcccggg gcaacagcga atgggtccgc cgcaccgcgg gcgaggccct 

116101 gccgtacgcc gggccggccg accgggagga actggtcgcg ctgttctggt 

116151 tggccgccgt gcgggacgac gacgcgccga tgatccccgt ggtgccccgg 

116201 ttgcccgacc ggccggtgcc gccggcccag gccggcgccc gtgcctggca 

116251 gctggccacg gcgggggagg acgcggacaa ggccaggaag ctcgcccgga 

116301 tcgccctcac cggcggggtg aacgagagcc tgatgatgcc gaaactggcg 

116351 gcctgcgccg cgctgttcgc caccgacgac aacgacgagg cggtgcacgg 

116401 cctggacacc atgctcaccg ccgcccgcag tgcccacctg cgcagcatgg 

116451 ccgcccgcat tttcaaccta cgggcccgga tacacctgtg cgcggcccgg

116501 ctggaggccg ccgaacgcga tctggacagc gccgagcgcg ccctgccgcc 

116551 gacgagttgg cacccccgtg cgctgcccaa cctgatcgcc acccgcatcc 
116601 tcgtcagcat ggagacgggc cgcccggacc gggcccgccg actcgccgag

116651 gccccggtcc ccgccggcgg cgaggagggt gtgtggtggc ccgccctgct 

116701 gctcgcccgc gcccgtgtgg ccgccgacga cggtgactgg gaggaggccc 

116751 tgcggctgtc gcgggagtgc gggcgctggc tttttcgccg gcactgggcc 

116801 aacccggcca tgctcagttg gcgcccgctg gccgccgagg cgtgtctgaa

116851 gctcggcgac gtgacggagg cgcgccggct gcgggacgag gagctgttct 

116901 tcgccgaccg ctggggcacc gcgagcgccc gcgggatcgc ccgcctgacg 

116951 acgcggcgac tcttcgacga cgacggcgac cgggccgtcc ggcggatccg 

117001 cgaggccgcc gccctgctcc gcgactcgcc cgcccgcctg gcctacctgt 

117051 ggagccggct gagccaggcc ggtgccgaga cggcccacgg cgacaccgcc

117101 gcggccgcac gctcctggca ggcggtcgcc cggatgaccg ccgcccaccc 

117151 cgccagccgc ctcgccaccg ccgcccgcac cctgaccgtc ccgtccgttc 

117201 cggtcgccac cgcgccgccc accgccgtcg tcccacccgg atggcgcgac 

117251 ctgtccgagg cggagaagga caccgtgctg ctcgccgccc gcggccacgg 

117301 caaccgccag atcgccgaac aactcgccgt cagcaggcgc accgtggagc

117351 tccggctgag caacgcctac cgcaagctga ggatcggcgg acgcaaggag 

117401 ctgtacctgc tcctggaggc gctggaagga ccggtcgcgg atgcttcttg 

117451 agcgggagaa cgaactggcc cggatccggg ccgccctgga cgccgcggaa

117501 gcgggcgact cctcgctcct cctgatcaac ggtcccctcg gcagcgggcg 

117551 ttcggcgctg ctgcgccgga taccggagct ggccggcgac ggcacccgcg

117601 tcctgcgggc cagcgccgcc tggcgggaac gcgacttccc cttcgggatc 

117651 gcccgccaac tcttcgacca cctgctgtcc ggggcgggcg gcgcagggcc 

117701 ggccgaacgc accgccgggg cagagcactt cagccgactg atggacaccg 

117751 gcgaccgccc taccgggacc ggccccgccc tggaggtctc ccaggcagtg 

117801 ctccagggcg cccaggcgct gctcgccgac gcgtccgcgg agcggcgcct

117851 gctgatcctg gtcgacgacc tccagtgggc cgacggcccg tccctgcgct 

117901 ggctggccca cctcacccgg cggctgcacg gcctgcgggc gctgctggtg 

117951 tgcacgctgg ccgacggcga ccaccggggc aggtaccccc tggtccggga

118001 ggtcgccggc gccgcgcaca ccgtcctgcg cctggcgccg ctgtcccggg 

118051 acgccacccg cgtcctgctc gccgggcccc agggccggcc gccgcaggac

118101 gcactggtgc gcgccgtgta cgaggcgtcc aggggcaacc cgctgttcct 

118151 gaccgccttc cggagcgctc tgcgcgccac cggaaggccg cccggcggcg 

118201 accacttcgg cgccgtccgg gagctgagcc cgacggtgct gcgcgatcgg 

118251 ctcgcgggcc atctgcggat ccagccgcag ccggtgcgcg aggtcgcggt 

118301 ggcggtggcc gcgctgggcg accacagcga tccggtgctg ctcgcccagc

118351 tcgccggggt cgatgaaatc ggtttcgccg gtgcccgccg cgcgctggtg 

118401 gacgccggcc tgttggcccg gggacgggac gtccgcttcg tccacggcgt 
118451 cgtccgcgat gcggtggact ccctgctcac cctcgacgag cgggaacgct
118501 cgcacgacga cgccgccgat ctgctgtacc gctgcgggcg gccggccgag
118551 caggtcgccg gccatctgct ggccgtggtc cacccgggcc ggccctggtc
118601 ggaggcggtc ctgcgctccg cggcccacaa cgcgctgcgc gccggccggc
118651 ccgccgacgc ggcccggtac ctgcgccgcg ccctgctgca ccaccgcacc
118701 caggacggct gccgcgcccg catcctggtc gatctggcca ccgccgagcg
118751 cgccctcgac cccgatgcct gtgtacgcca cgtcagccag gcggtcgcgc
118801 tgctggacac ctcgcgggat cgggccgccg ccgtgttgcg catcccgccg
118851 tccctgctcg ccgcccccag cccgtccgcc gtcgagttgg tgcggcaggc
118901 cgccgccggg ctcgacgaac cggggcagcg ggacgaggag ggagccgacg
118951 aactcgccct gcgcctggag gcgtggctgc ggcactccgg ccacgagaac
119001 cccgtcgagc tggcgtcctc ggtggcgcgg ctgcggcgca tgggggcacg
119051 gccgccggtg gacagcgtcg ccgaacgcga gttggtcgcc gtgctgttga
119101 gtgcgggcgc gctcagcggc cggctcagcg ccgcggagat cgccgacacc
119151 ggcaaccgca tcctcgaacg tgagccggcc accgccgccc atgcccacac
119201 cccgctgccg ctggtgatgc tctcgctgtt cgtcgccgag tccgtgcagg
119251 gcgtggcctc ctggctggcc agcgaacagc acacccggcg ccggtacgcg
119301 accggcgcgg acgatgtgct gctgaccgcc gagcgggcct tcgtcctggt
119351 gacgcagggc cgcccggccg ccgcgcggga gcacgtcgaa cgcgccctgg
119401 tcatggacgc cggcgactgg tcggaacccg ccgtcatgat gttcgccgcg
119451 gtcgccttcg aactgcgcga cccggccttg agcgaacgca tcctggaacg
119501 gatccgcgac cgccggccgg ccggactggc gctcaccgcc accggtcaga
119551 tgctccaggc cgccgtcgac gtgcacttcg gccgggggcg ggacgccctg
119601 gacacgctgc tggcctgtgg ccgacgcctg gagaccgtgg gatggcgcaa
119651 ttccgcactg ctgccctggc gtccgtatgc aatcgggctg caccagcggc
119701 tcggcgagac cgatgccgcg ctgcaactcg ccgaggacga gctgaggtgg
119751 gcccgggagt ggggtgcgac gacgaacctg ggccgggccc tgcgcctgaa
119801 gggctggctg ttgcaggacg agggactgga tctgctgcgt gagagcgtcg
119851 agatcctccg cgcttcgtcc tacgcgacgg aactggcgcg caccctcgtc
119901 gtcctcgggc gtcggctgcc gggtgggccg gaggccgagg cggtgttgcg
119951 ggaggccgcg gggatcgccg ccgcctgtgg tgttccctgg ctcgccgaac
120001 gggccgaact cggtctgggc agtgccatcg tgccgccggt cgccaccctg
120051 acccccagcg agcggcgagt ggcgtcgctg gtgagccggg gcctgaccaa
120101 ccaggccatc gcgaccgaac tcggtgtgag ctcccgggcg gtggagaagc
120151 acctcaccag cgcctatcgc aagctcggcg tctccggccg ccgcgagttg
120201 gtgaatgcgc tcccgggtcg ttgacgcccg cggccgtccc gttccggacg
120251 atcacaatcc tctgtgcctg tcagccctgt gggtgcggcg tcgttgtcgt
120301 acgcccgtcc ggtcatgaca gcaatcctca acaaaagctc aaacggaagc
120351 gggggcaggt gcatccaccc catcaccaat gtctcgtatg ttacttttgg
120401 cccaccgtag ttgggactct gcggaccgta cctcgtcaag tacggaccac
120451 ggatttaggg gtgcgtagat gtgtcgcgtt gacgtgccga gttgtggcta
120501 actacgctcc gtttcacgga aatcgaaact cttcggcgtg aatcgcggcg
120551 aacttcgatc gaattctgga ttccgtttct cagaaggcta aaagacgacg
120601 ggggttatcc gaaaggatgg cggtctcgtg actatcactc acctcactga
120651 ctcgaatcac cgtacgaccg gcggggtgat atccgctcag accgcaccgg
120701 ccggggagtc cgtcggcccc gggctgatgg cgtccctcga ccgtgacctc
120751 accatcaagc acgccaacca ggagttccgc cgccgcttcg acgattccgc
120801 cggtgacgtc tgcggtcgca gcttccggga cctgatgcac ccgagcgtgc
120851 agcagccgct gatgcgccag ttctcccggc tgatcgaggg caagcggcac
120901 cgcttcgcct cgcacgtggt cgccgtgggc gcccaggacg cggccttcgc
120951 gggcaccctc accgcctccg cggtcaccgg caagaccccg gacatcgccg
121001 ggatcctggt gctcatggac tcgtccggcg ccgccgacgc cgccgatgcc
121051 ggcgtcgtga ccagccagaa gaagttcctg accgagatcg atgcgcgcat
121101 cctcgaaggc atcgcggccg ggctgtcgac cattccgctc gcctcgcggc
121151 tctacctcag ccgccagggc gtggagtacc acgtcaccgg gctgctgcgg
121201 aagttgcggg tgcccaaccg cgccgcgctc gtctcgcgcg cctactccat
121251 gggcatcctg aacgtcggca cctggccccc gaaggtcgtg gacgacttca
121301 tcaagtgacg cccggccggc ctgtcccgca ccggccggcc accgtcggac
121351 cgctgtcccg cacgcccgtg tcgccgacgc ccgcctctca cggggcgatg
121401 cccggcggca cgggcgtttc gtgccctggt cgatgtttat ccattctcac
121451 gttacgaacc cgcaatgctc atgcgcgcga ttccggccac ttttcgttcc
121501 gtcgtaacca aactgcccgt catatcgttt ccactgaccg tttctgtggt
121551 ccgcggtgtg ccacctgggc cccactccga cgcgcccgcg acgggtcgcg
121601 agcggcctcc gaatactcac ggtaaccccc gttcctcctg gtggagttcc
121651 ccgcgcgtca cattcctccg tgcctcgtgc ggcgtggctc gcccgctttc
121701 cccgccgccg cgacagccat tgttacaacg gcacgacaga gcgggggtgg
121751 tcgacccgtc cgcggtgcgg cgcagtcaag cggaacctct gctcgatgtt
121801 cgtgccaccc cgaactctgt gttcttggtg accgttttca ccaatgcccc
121851 gcccgttcgg tggcgaatcc gccgcccgtt cctggccaca acggatcgct
121901 tcggccgccc accacggccg gcgcagccga ggcttgttgg aatcaccccc
121951 acccagtcgg ggcgccgttg tggcccggac acgcaggagg aacaccgtgg
122001 atgctgaggg ccgccgcagg gacatgctgg agctgatccg gcgcagcgga
122051 tccgccgatg tcgtgcgtct cgccgaggag ttcgccgtca gcaaggaaac
122101 ggtccgccgg gacctgaacg tcctggaggg ccatgggctg atacggcggc






WO 01/59126 PCT/GB01/00509 


122151 ggcacggtgg cgcctatccc atggtgcggc cgggctccga ggccgtcttc 

122201 gtctcccgga ccgcacagcc gatcccggag gagtcccgga tcgccaccgc 

122251 ggccgccgaa ctcctcagcg aggccgagac ggtcttcatc gacgagggct 

122301 tcaccccgca actcatcgcc gacgccctgc cgcgcgaccg gccgctgacc 

122351 atcgtcaccg cctcgctccc ggtggtcagc gccttcgcga cgagcccaca

122401 ggccaacgtg ctgctcctgg gcggccgggt ccgccggggc acgacggcca 

122451 ccgtcgacca ctgggccgtc cacatgctgt ccggcttcgt catcgacctg 

122501 gccttcctcg gcgcggaggg gatctcgcgc aggtacggcc tgaccacccc 

122551 cgacccggcg gtcgccgagg tcaaggccca ggccatccgc gtcgcgcgcc

122601 gcccggtcct cgccggggtg cacaccaagt tcggcacggc gagcttctgc

122651 cggttcggag aggtgggcga cctggagacg atcgtcaccg gcgccggcct 

122701 gcccgtcgcc gaggcccacc gctaccacct catgggcccc aaggttttac 

122751 gggtgtgacg ccgccggggc gtcccgccgc ccgcccacgg cctggcgcgc 

122801 gccggtgcgg gtcagccgcg tcgcgcgtcc gtcggttccc gccagagtgc 

122851 gagcggtgtc acccccaccc ggtagccctc ctcgtcgagg agatggccgc

122901 cgtgcttcat gtgccagagg gcgaccgcgc gttcggtgac ggtcaggtcc 

122951 gcggcccgtt cggcgagtcg gacctcgaag cgctgggcgt cgtcgaggga 

123001 gcgcacccag gccaccaggt gcaggttgtg ccggctgatg acgctcgcgc 

123051 acaggcggac ctcgcgcatc ccggtgaccc ggcgggtcac ctcgcgcagc 

123101 ctcgcggccg ggacctgccc ccagaagctc accgtgaccg gccactcgga

123151 cagcgggcgg gcgacctcgc accgggcgtg caacatgtcc gcggcgaaga 

123201 gccgttggac gcgacggcgc acggtgtcgg ggccggcgcc gcactgctcg 

123251 gccagcgcgc ggtaggtggc ccggccgtcc acggacagcg cggtgaccag 

123301 ctgttggtcg agctcgtcga ggacgaaggc gggggtgtcc gtgcggtgtc 

123351 tggacgcgtc ggcggcgagc cgggcgacct ggtgccggcc gagggcccgc

123401 aggcgccacc ggctgccctc ggtgtgcacc gggccggcca ggtgcgtccg 

123451 ggcggcccgg acgccgtcca acgcggcgag gtcgtgggtc acccaacggg 

123501 agagcatggc cggatcgcgg gccatgacgt tgagttggag gtcacggtcc 

123551 ccggtgacgt gggagagcgc caccacgtgc ggcgcggcgg ccaacgcccg 

123601 cgcgacgtcg agcagtcggc cgggagcgca gtcgatctcg acgaacgcca

123651 ggcacccctg gcccgacgcg gccagcaccg gggccggatg gcagctgatc 

123701 caggcggcgc cggtctcgac cagccggttc cagcgcctgg ccaccgtcac 

123751 ggcgtccagt ccgaggacgg agccgatccg ggtccagctg gcccgcggcg 

123801 tgatctgcag cgcgtgcacc agggcctgat ccacatggtc cagggaccgg 

123851 ggggtctgcc cggaatcctg cgccacgggc cacctccttg cgtgtttccg

123901 gcggattcgg gccgccggtc ggctcaacct tcagcctgga ctcgggtacg 

123951 gccggaccgt accaggcaac ccccggagca acaggagtgg gattcggcat 
124001 gtcggcagtg gaacgcgcca aggaaatgca acccgagttg gagcggctgc
124051 gccggtcgct gcaccgggag ccggagttgg gccttgcgct gccgcgcacc
124101 caggagaagg tgctcgcggc gctcgacggg ctgccgttgg agatcacccc
124151 gggcaccggc ctcacctcgg tgaccgccgt cctgcgcggc ggccgccccg
124201 gcggcgcagt gctgctgcgc ggcgacatgg acgcgctgcc ggtcaccgag
124251 gagagcggtg tgccgtacgc ctcggagatc cccggccgga tgcacgcctg
124301 cggacacgac ctgcacaccg ccggactcgt cggcgcggcc aggctgttgg
124351 ccgaacgccg ggaacgcctg cacggcgacg tggtgttcat gttccagccc
124401 ggcgaggagg gccacaacgg ggccggcgcg atgatcgagg agggcgtgct
124451 ggacgcggcg ggcaagccgc tcgacgcggc gtacgccctg cacgtggcgt
124501 ccaaccaggt gcccggcggc gtggtcatca cccgctccgg caccatcacc
124551 tcggcctccg acctgctgac ggtgacggtg cggggcgagg gcggccacgg
124601 ctcgacgccc gccacggcca aggacccggt cccggcggcc tgcgagatgg
124651 tcaccgcggt gcagaactgg gtgacccgcg ccttcgacat cttcgacccg
124701 gtcgtggtca ccgtcggcac cttccacgcc gggacgaagg cgagtgtcat
124751 ccccgacaac cgccgagttc caggccacgg tgcgcagtta ttccgaatcc
124801 gcccgggacc ggctcgccga agggaactgc cccgtctggt gcgggccatc
124851 gcggccgccc acggcctgga agcggaggtc gactacctcc ggcactatcc
124901 ggtcaccgtc aacgacacgg acgagaccgc cttcgcggtg gccaccgccc
124951 gcgacctgtt cgcgcccggc gaggtgtggg agtcgccgat acccaccaac
125001 ggctccgagg acttcgcgta cgtcctccgg cgcgtccccg gcgcgatgct
125051 gccgcccccg aggggagcga ctggcagcac gcgccgatga
125101 accactcgcc gcgcgccgtc ttcgacgacc gggtgctgta ccggcaggcg
125151 gcgctgctga ccgaactcgc cgcgcgccgg ctcgcggcca ccgcgccggc
125201 cgagcccgcc gtggcgggat gacaccgccg ccgcacgcgt ggggcccgca
125251 cgcgtgcggc gtacgggccc tggccggtct ccttcccggc tacttgcgcg
125301 cggggcactc cttccacgcg aagtggtaga gggtgttgat gctgccgtcg
125351 gtggagtcat ggccatgtag ctggtggcct tggggtcagc cccgctgccg
125401 t

SEQ ID NO. 36



NysDII 

1 msftypvsmp wlqgreldyv teavgggwis sqgpyvrrfe eafaayndvp
51 fgvacssgtt altlalralg vgpgdevivp eftmiasawa vtytgatpvf
101 vdcgddlnid vsrieekitp rtkvimpvhi ygrqcdmdav lnlayeynlr
151 vvedsaeahg vrprgdiacf slfankiisa geggvclthd phlaeqmahl
201 ramaftkdhs flhkklaynf rmtnmqaava laqteqldti lalrrdiekr
251 ydealrdipg itlmpprdvl wmydlraerr delcaylage gietrvffkp
301 msrqpgyfsa dwpalnaarl sadgfylpth tgltaqeqef itgrirafyg
351 va



SEQ ID NO. 37



NysI

1 mdneqklrdy lklatadlrr trrrvhkles aaqepvaiig mtcrypggvr
51 spedlwrmve agehgvtpfp tdrgwdleal aaaptasggf lhdapdfdad
101 ffgispreav amdpqqrvvl esaweafera gidptsvkgs rtgvfigama
151 qdyrvgpadg aegfqltgnt gsvlsgrisy tfgtvgpavt vdtacssslv
201 avhlatqalr agectlalag gvtimsgpgt fiemgrqggl sadgrcrsfg
251 dtadgtgwae gvgilvlerl sdavrnghei lavvrgtavn qdgasnglta
301 pngpsqqqvi qqalvnarla agdidvveah gtgttlgdpv eaqallatyg
351 qnrpadrpll lgsvksnlsh tqaaagvagv ikmvmamrhg tlprtlhaee
401 pthhvdwsqg avrlltdttd wpatgaprra avssfgisgt nahtiieqap
451 epqpedaata qddaagstpa tapvvpgvvp vllsgrtpda lrgqaaalra
501 aldtgrrpdl ldlahslatt ragfehravl latdhpaltd gltaladadd
551 paaapawitg ttraetrlav lftgqgaqrl gagrelaarf pafataldaa
601 ldaftphldr plrevlwgtd aalldrtaya qpalfaveva lyrliesfgv
651 rpdhlaghsv geivaahlag vlsladaatl vaargrlmqa lpdggamiav
701 qaseadvapl laghedqvai aavngpsavv lsgaeatvta laeqlaadgr
751 ktrrlrvsha fhsplmepml dafravvedl tlqppllpvv snltgkpatv
801 aqltsadywv dhvrhavrfa dgidwlarhd ttaflelgpd gvlsamaqdc
851 ldaadadavt lpalragrpe ehtlttalag lhvhgatldw tgcfagtgar
901 rtdlptyafq rrrywpkalq sgtadlrsvg lgaahhplls aavsladagg
951 tlltgrlsrq thpwladhtv rgttllpgta flelavragd evgcdrveel
1001 tlaaplllpe qggvqvqlwi gnpdvsgrrt vnvharpdtg ddtpwtahat
1051 gvlttadasr qlpasseqgg tplagdphpa ldaaqwppag aeplpldghy
1101 drladggfgy gpvfqglraa wrggdvvyae velpeagrsd aeafglhpal
1151 ldaalhaapf tglgergrgg lpfswegvsl haggattlrv rltpvaddal
1201 altvadgtga pvlsvdslvl rsvatqqldt aaavardalf rldwtpvqpt
1251 atdpgpvall gadpfgllth agfadapayp dlaalaaadg pvpttvvlsl
1301 agtgddaadp arsahrcaae alaavqtwld hherfaaarl vfvtrgatvg
1351 rdvaaaavwg lvrsaqsenp gcfalvdldp dgavgaaalv aalvsgepql
1401 avrgdvlrva rlvrrpltev gagadgtgdg vgdgsgvsfs gegavlvtgg
1451 tgglgavlar hlvaeygvrd lllvsrsger avgagelvae lagvgarvrv
1501 vacdvtdraa vvelvgghav savvhaagvl ddgmvgaltg erlsavlrpk
1551 vdavwhlhea trgldldafv vfsslagvfg spgqanyaaa nafldalmtr
1601 rraeglpgls lawgpweqsg gmtgtltdvd aerlarsgvp plsvaqglal
1651 fdaavagtda tcvpvrldlp vlrargevpp llrslirvra rraavagsat
1701 agnlaqrlrr ldedgrdemv ldlvrgqval vlghatggdv dagrafrdlg
1751 fdsltavelr nrlntvtglr lpatlvfdyp tvrhlatyvl dellgtdaev
1801 atvqpaavav addpivivgm acrypggvss pedlwrvlte gtdavsgfpt
1851 nrgwdvesly hpdpdhpgts ytrsggflhe agefdpgffg msprealatd
1901 sqqrllless weaieragid pvslrgsrtg vfagvmysdy samlaspefe
1951 gfqgsgssps lasgrvaytl glegpavtvd tacssslvam hwamqalrsg
2001 ecglalaggv tvmstpavfv dfarqrglsp dgrckafada adgvgwsegv
2051 gvlvlerqsd avrngheila vvrgsavnqd gasngltapn gpsqqrvirq
2101 alasggltag dvdvveahgt gttlgdpiea qallatygrd reperplllg
2151 svksnlghtq aaagvagvik mvlamrhgvv prtlhvdaps shvdwsegav
2201 ellseqaawp etgrvrragv ssfgisgtnv hviveqapga kaiaaagaar
2251 rtpgavpvll sgrgrsalrg qaarllghlq arpdaelvdv alslattrsr
2301 feqraavvaq drdqliaslg alaadrpdpa vvegeaagrg rtavlftgqg
2351 sqraamgrel hevqpefaaa fdavcavfdp lldrplrevv faedgsdeaa
2401 lldetgwtqp alfavevalf rlveswgvrp dfvaghsige iaaahvagvl
2451 tledacrlva aratlmqalp tggamiaiqa tedeiaahld dtvaiaavng
2501 pqsvvisgde eaaetiaatf aergrktkrl rvshafhspr mdgmldafri
2551 vaegltyrap riplvsdltg rraddaevct aeywvrhvre avrfadcvrt
2601 lrdagattfl elgsdgllta maedtlgddh daelvpmlra graeelaaat
2651 alarlqvrgv dvdwaaylag tgarrtdlpt yafqhayywp qlptpaaala
2701 aadpadqqlw aavergdare ladilglgeq dltpldsllp altswrrgnq
2751 ekhlldtlry rvewtrlskp tapvldgtwl lvasdataad qpalldglad
2801 algshgarvr rlllddscad ravlaerlar tadvdaatqv lsvlplderd
2851 addcppltrg laltvalvqa ladtgaqgrl wtatrgavst npadpvthpv
2901 qaaawglgrg valehprlwg glvdlpqvfd eragqrlagi lavkdapdge
2951 dqvalratgv sgrrlvrhtv ealptaaeft atgtvlitgg tgglgaevar
3001 wlaragaqhl vltsrrgpda pgaaelrael egygpsvsvv acdvadrdal
3051 aavltalpee lpltgvvhta gvghygpldt lstaefaglt aaklagaahl
3101 dalladreld ffvlfgsiag vwgsgnqsay gaanayldal alhrrargla
3151 atsvawgpwa eagmaaddav setlrrqglg lldpapamte lrravvrqdv
3201 tvtvadvdwq ryaplftsar psaliaglpe vralaadert eqdatgasev
3251 vtrvralaep eqlrlltdlv rtesatvlgh ssadavpegr afrdvgfdsl 
3301 tavelrkrlg aatglslpst mvfdyptple laqylraeil gavlevagpv 
3351 atggaddepi aiigmacrfp ggvsspeqlw dlvasgtdai sefpvnrgwq 
3401 tghlfdpdpd rpgttystqg gflheadefd ptffgispre alvmdpqqrl
3451 llettwesfe ragirpetlr stltgtfvgs syqeyglgag dgteghmvtg 
3501 sspsvlsgrl syvfglegpa vtvdtacsss lvalhlacqs lrngesnlav 
3551 aggatimttp npfiafsrqr alakdgrcka fsddadgmtl aegvgvvlve 
3601 rlsdaqrngh pvlavlrgsa inqdgasngl tapngpsqqr virqalanar 
3651 lapgdidale ahgtgtplgd pieaqalfat ygrdrdpesa lllgsvksni
3701 ghtqsaagia svikmvmalr hselpptlha dapsshvdws agtvrlltqa 
3751 rawpetgrpr raavssfgis gtnahvlleq apvadtpaee rpavapvpia 
3801 agvvpwvvta rsaaalrgqa erllahaetv gtalpaagpl diglslvsar 
3851 arfehravvv ppagtdplaa lravatdgps pvvargvadv egrtvfvfpg
3901 qgsqwvgmgs qlldesavfa eriaecaaal aeftdwslvd vlrgvvgaps
3951 lervdvvqpa sfavmvslaa lwrsrgvlpd avvghsqgei aaavvsgals 
4001 lrdgarvval rsqaigrala grggmmsval svdvleprlv efegrvsvaa 
4051 vngprsvvva gepealdalh arltaddira rriavdyash shqvedlhee 
4101 llevlaelap rtsevpffst vtgdwldtar mdagywfrnl rgrvrfadav 
4151 adllaaeyra fvevsshpvl smavqeaide agvpavaagt lrrdqggtdr

4201 fllsaaevfv rgvdvdwagl fegtgasrid lptyafqheh lwavppapea 
4251 vaaadpddaa fwtavedgdv saltaalgtd edsvaavlpa ltswrrarrd 
4301 rstvdawryr vawkplggtl phpsltgtwl lvtadgiddt dvagaletyg
4351 aevrrlvlde ecvdravlre rlagaedvtg ivsvlaaaer tdavpgtslv 
4401 lgtaltvali qalgdaeida pvwaltrgav stgradelta pvqaqvtgig
4451 wtaalehpqr wggtldlpaa ldaraaqrla avlsgalgsd dqlairpsgv
4501 ftrrivraea tagrpagtwt prgttlvtgg sgtlaphlar wlaqrgaehl 
4551 vlisrrgtaa pgaaelvael aesgteatva acditdrdav aalladlkad 
4601 grtvrtvvht aatielhtld attladfdrv lhakvtgaqv laellddeel 
4651 ddfvlyssta gmwgsgahaa yvagnaylaa laehrrangl palslswgiw
4701 addlklgrvd pqmirrsgle fmdpqlalsg lqralddnen vlavadvdwe 
4751 tyhpvytsgr ptplfdevpe vrrltaaaeq sagtvaegef aaalralsda 
4801 eqqrtlletv rteaasvlgl ssaedltdqr afrdvgfdsl tavglrnrla 
4851 svtgltlpst mvfdypnpaa laaylhgela garsaaagaa avptgapdad 
4901 dpiaivgmsc rypggvgsae dlwrialdev daisgfpadr gwdaeglydp
4951 dpdrpgrtys vqggflrdva efdpgffgis prealsmdpq qrllletawe
5001 afehagidpv gqrgsrtgtf vgasyqdyas gvpnsegseg hmitgtlssv 
5051 lsgrvsylfg fegpavtldt acssslvamh lacqslrnge sslalaggvs
5101 imstpmsfvg fsrqralaed grckayadga dgmtlaegvg lvllerlsda
5151 ranghqvlav irgsavnqdg asngltapng psqqrvirqa lansavapgd
5201 idvleghgtg talgdpieaq allatygqdr aperplllgs vksnightqm
5251 asgvasvikl vralqegvvp kslhidrpst hvdwssgaig lltertpwpe
5301 tgrprraavs sfgisgtnvh tileqapade aptpadpprd glvpvllsgr
5351 geaalraqaa rllafveerp eahltdlahs latsraaler raaviaadrd
5401 tltrglrals dgrpdpglvq gtagrgrtaf lftgqgsqrp gmgrelhdry
5451 pvfadaldev larlddgpdr plrevlfaap dsaeaalldr tgyaqpalfa
5501 vevalfrllt swgltpdyla ghsvgelaaa hvagvlsldd actlvaargr
5551 lmqalpegga mvaleaaede vlpllegltd rvsvaavngp rsvvvagvee
5601 dvllladlfa adgrrtkrlr vshafhsplm damlddfaav argltyhppt
5651 ipfvsnvsgg lataeqvrtp dywvghvraa vrfadgidwl atqgdvhtfl
5701 elgpdgvlsa maresltdps rtallptlrg drpeepalvt avaaahahga
5751 rvdwsgyfad hgarrttlpt yafqrerywp dttaatsaht pgsaldaefw
5801 aaverddvaa laasldldda tvtamvpalt awrrrrgeqt eldswryrvt
5851 wkprggatap aaltgrwlvl vphdhqdrqd dataawaadv etalgtttvr
5901 ltvtttdraa laariteaag dqgpfsgvls llplatgdag hpgapaaltl
5951 tttavqalgd agidaplwnv trgavavgra eqvtapeqaa vwglgraval
6001 elparfggtl dlpatldgqa arrlravlaa tdgedavalr psgvflrrla
6051 hapagpdtar tafdpaagtv litggtggig ghvarrlard gathllltsr
6101 rgpaapgada lraeleelga rvtlaacdaa drdalaalla elpddaplca
6151 vfhtagvved hvvdaltpen faavlraktv aahhlhelta dldlaafvlf
6201 sstagvlgaa gqgnyaaana hldalaehrr shgltalsva wgpwagsgmv
6251 adaaeltdrv rrggfeplap epavrallra ienddttval adidwerfqr
6301 afaavrplpf vadlpetgra tpatatgaat glrqqlaelp eherpaavld
6351 llrtqvaavl ghadprtved dhafrdlgfd sltilelrna lnaatglslp
6401 atlvydlptp remadfllae llgtlptdta atvastaspk lsasfeqggt
6451 pfddpiavig igcrfpggvt tpeelwqlld egrdgisrfp ddrgwdlaal
6501 gagasdtleg gfltgvadfd arffgispre alamdpqqrl llettweale
6551 ragidpttlr gsttgvfvgt ngqdyptllr rsasdvagyv atgntasvms
6601 grlsyalgle gpavtidtac ssslvalhwa gralragecd lvvaggvsvra
6651 aspdsfvefs tqgglapdgr ckafsdaadg tawsegvgil vlerlsaarr
6701 nghqvlglir gtavnqdgas ngltapngls qqrviaqala darlrpadid
6751 aieahgtgtt lgdpiearal itaygrdrda erplllgtvk snightqaaa
6801 gaagvikmlm amrhgtlprt lhvgtpsshv dwsggtvall ddarpwprtg
6851 qprragvsaf gvsgtnahvv veqapeteap aapaaepape atptvvpwvv
6901 sgrsrealqa qldrltahta ahparsaadv grslatdrtl fphravllag

6951 pdgvreaara aaprtpgrta flfsgqgaqh almghdlyqr fpvyadaldt 

7001 vlaqfdtvld vplraalfaa pgtpeaalld qtgftqpalf avevalfrla 

7051 eswrltpdfv aghsigeiaa ahvagvfsle dactlvaara slmqqlprdg 

7101 amvaleated evaplltdgv alaavngprs vvvagaedav ravadrlaad

7151 grrtrrltvs hafhsplmdp mltdfarvae gltyheprip lvstllgapa 

7201 gaelrtpdyw vrhvretvrf adgvralhda gagtfveigp dgvltaltqq 

7251 tldtveagap avvvplqrrd ragdlalleg latlhthgtg pswpayfeat 

7301 gghrtdlpty afqrerywpe lgapvatapq dpaawryhet waplpapeaa 

7351 apagralvlv pagnrdtawm tavadalgad tvtaepdala eqltaagdtp

7401 wrvvvsllaa aseglpadga wpaallatld eagvhaplwc vtrgavavag 

7451 eaptavgqaa lwglgrvaal dhpdrfggla dlpadtdaha agllaahlaa 

7501 pgteaeiavr atgvharrlv rtpaaadgat wlptgtvlvv ggtggtgtmg 

7551 graarwlvre garhlvltap dgtttaadte altaelaalg aritvvdhdp 

7601 tapdgfaall dglpddtplt avvyapeada apgtaaelsa alapvtalga

7651 altgrpldaf vlfgsiaglw gvrgraaeaa sgayldafar acrdrgtpal 

7701 avawgawadl vgpslaahlr mnglpvmdad taltalsrav adgsaaeava 

7751 dvrwetfapl hhearrtalf dalpeargal aeaardradr ktaagdygrw 

7801 laeqpaadhd aillalvtek aatvlghadh dllepdlpfr dlgfdsltav 

7851 dlrnqltaet gltlpatlvf dhpnpaalaa hlraqllgea sdsaapvaap

7901 valgadddai vivgmacryp ggvtspedlw qlvgdevdav gdfptdrgwd

7951 laalagdgpg rsataqggfl ydatdfdpgl fgisprealv mdpqqrille 

8001 tswealerag idpatlrgsg ttgvfvgggs gdyrppeeag qwqtaqsasl 

8051 lsgrlaytfg iqgptvsvdt acssslvalh laaqalrage csialaggvt 

8101 vmatpvgfve fsaqgalspd grcrafsdda ngtgwsegvg mlvverlsda

8151 rrnghrvlav lrgsainqdg asngltapsg paqqrvirqa lanarlrpad 

8201 idaveahgtg trlgdpieaq allatygqdr erpvllgslk snightqaas 

8251 gvggvikmvl amqhgelprs lyaenpsshv dwtagrahll tartpwpdsg 

8301 rprraavssf gasgtnahai leqppreelp arpaddgapl pfllsgrsqn 

8351 alraqarrll arltahpdtr aadlayslat traafehraa itatdhdglr

8401 tgltavaegt taphtaehhl qgtgkravlf sgqgsqrlgm grelherhpv 

8451 faeafdsvla rlddrldtpl rdvvwgtdee alhatgntqp alfavevaly 

8501 rlieswgvrp dfvaghsvge laaahvagvl slddacrlva araalmqrlp 

8551 aggamiavea tedevtpllt dgvslaavng ptavvlsgag davtalgqal 

8601 aerghrttrl rvshafhshl mdpmladfrt vaegleyhpp ripvvsnltg

8651 dvadaadlcs adywvrhvrg tvrfadgvrt madrgvhlfl elgpdavlsa 

8701 marqcapdav vvpalrrnrd edetlvgava rlhvhgagpr wdayfagrga 
8751 qwldlptypf qrgrfwpesl pgaasaapaa gqpaetdaaf wdavaqedft
8801 alesvldves dalskvlpal mdwrsrqade sqlagwrhri vwkrltgaal
8851 ahrkalsgtw lavvpegfad dpwvtttldg lgthlvhlev aeadraalad
8901 aiaartadgt rfggvislla lreeltgavp egtaltttll qalgdagvda
8951 plwcvtrsav sagrtdrphr plqgavwglg rvaaleypqr wgglvdlpee
9001 pdersaagla avlagldged qvavrgtavl arrlvpapgr kpsrpwhpsg
9051 tvlvtggtga lgahvarrla kdgaqhlvll srrgpdapga aelraeldal
9101 gtdvtvaacd vadrdqltav ldalpadrpl tgvvhtagvl ddgvldrltp
9151 erfqevfrak vtsallldel trdrelaafv lfssasaavg npgqanyaaa
9201 navldalaeq rrvlglpats vswgawgggg madadgadea arragvgamd
9251 phlaveallr lvaekeptav vaevaldrfa gafggsrpsa llrefpgyre
9301 alaaqaeqaa dggglaarla alpparrldt vvdlvrtraa qvlgypdtea
9351 vaaersfrdl gvdslgavel rnqlsaatgl nlpatlvfdh ptplvlgehi
9401 lgglfpdepa gsddeteira llasvpldql reigvlepll qlagrggraa
9451 dgddgesvds mtvadlvraa lngqsdl



SEQ ID NO. 38



NysJ 


1 mnapenpetp ennvvaalra avketdrlrr qnrmlvaaak epiavvgmac 

51 rfpgavdspe alwemvatgt dvisgfpddr gwdlealrns gtdardtdvs 

101 qrggfldcia dfdpgffgis preavtmdpq qrlllttawe averagidat 

151 tlratrtgaf igtngqdyay llvrslddat gdvgtgiaas aasgrlsytl 

201 glegpaltvd tacssslval hlavqalrng ecgmalaggv nvmatpgslv

251 efsrqgglar dgrckafada adgtgwsega gvlllerlsd aqrnghpvla 

301 vvrgsavnqd gasngftapn gpsqqrvirq alanaglatg didaveahgt 

351 gtplgdpiea qsilatygqd rahpvllgsi ksnmghtqaa sgvagvikmi 

401 mamrhgvlpr tlhvdrpsth vdwttgsvel ltdahpwpet grprrtgiss 

451 fgvsgtnahv iveqapdtpa eaaddtpprt prtlpwllsa rtgaalrdqa

501 talldhldrp dgdrgptald tafslattra alehrlavvt gtdgtagrda 

551 ltawlahgta pdaheghaag rtrcaalfsg qgaqrlgmgr elharfpvfa 

601 raldtavdll daelggtlre viwgtddapl netgftqpal favevalyrl 

651 ieswgvapdf vaghsigeia aahvagvfsl edactlvaar aglmqalprg 

701 gamvaveate devsplltdg vaiaaingpt slvvsgdeta tlavaarlae

751 qgrrttrlrv shafhsplmd pmlaefrava eglsygepqi pvvsnltgav 

801 adgtllgtad ywvrhvreav rfadgiralt dagvgaflel gpdgtlaala 
851 qqsapdavsv pvlrkdrdee paavaalarl htagvpvdwt afyagtgahr
901 tdlptyafqy erywpkatyr padatglglt aadhpllgaa msvagsdell
951 ltgtlslath pwladhvvgg mvffpgtgfl elavraadqv gcdrveelml
1001 aaplilpatg tvqmqiavga adddggrdlr fftrpgddpd aawaqhatgr
1051 itegervlal dtttwpprda epvdidglyd ryrangldyg pvfrglravw
1101 rrdteiyaev alpegtadad afglhpalfd avlhstlfas adgddrsllp
1151 fawngvslha agadalrvri tscgpdavei tavdpqgrpv vsvesltlra
1201 agpdagtadh radagslfrm dwtprtvhap atpatwavlg tdpiglteal
1251 taagpdtvtg lrdgvdalge ltagddrpvp dvvavplrga tdhgpagahd
1301 ltrtvlallq ewlaeerfar srlllvtrga vadgergpld laaapvwglv
1351 rsaqsenpgr lllvdlddta esaaqlpllp alldadepqa vvregtvrvg
1401 rlarldsgrg lvpppgtpwr lgsrakgsld glallphpea rrpltghevr
1451 vgiraaglnf rdvlnalgmy pgdaglfgse aagvvvevgp evtglapgdr
1501 vmgmlfggfg plgiadarll tpvpadwswe tgasvplvfl tayyalkelg
1551 glragekvlv hagaggvgma aiqiarhvga evfatasegk wdvlrslgva
1601 ddhiassrtl dfeaafaeva gdrgldvvln alsgefvdas mrllgdggrf
1651 lemgktdira adsvpdglsy hsfdlgmvdp ehiqrmlldl velfdrgala
1701 alpvrswdvr rageafrfms laqhigkivl tvpqpldpdg tvlltggtgg
1751 lagllarhlv tehgarhlll agrrgpdapg aaalhaelta lgaevtvaac
1801 dvadrtalaa llatvpaehp ltavvhtagv lddgtltaln pdrlatvlrp
1851 kvdaawhlhd ltrhldlaaf vlysstagvm ggpgqanyaa gntfldalaa
1901 hrhalglpat slawgaweqg agmtgaltdh dlrrvsdagg qplltaergl
1951 alydaataad eplivplglt ggalpagvgv pavlrglvrt agrraragta
2001 gvsraglaer laalpeeert pflvelvrte aatvlghgst dpvdarrefr
2051 qlgfdsltai elrnrlgkat gltlpatlif dyptpdrlav hlhdellgad
2101 apvtvtaaaq aadpehdpvv ivgmscrfpg gvsspeelwd lvasgtdait
2151 gfpadrawdr hpqlagapga rtgqggflrd iadfdaaffg isprealamd
2201 pqqrilleva weaaeragid pqtlrgsdtg vfmgvsgqdy aglvmrsrdd
2251 iaghattgla vsvvsgrlay alglegpals vdtacssslv slhlaaqalr
2301 agectmalag gvtvmttaan ftgfsrmggl aqdgrckafs dsadgtgwse
2351 gaavlvlerl sdarraghrv lavvrgsavn qdgasnglta pngpaqqrvi
2401 rqalanaglt pvdvdaveah gtgtplgdpi eaqaliaayg tdrdpehpll
2451 lgsvksnigh tqsaagaagl vkmvmamrhg ilpqtlhlte psshvdwsag
2501 tvrllterta wprtdrprra gvssfgisgt nahvileqpp aeptpaadpg
2551 rpaptvvawp vsaqtpaald aqldrlrtaa alapldtaht latgrslfeh
2601 ravllatvgd patgapdlpe vargaatphr taflfsgqga qrsgmgrelh
2651 aafpvfaaaf devvavldae lgsdadggvs lrevmwgggs elldrtrftq
2701 palfaveval frlvaswgvg pefvaghsvg eiaaahvagv fslvdacrlv
2751 varaslmdal pvggvmvave aaeaevvpll vdgvaiaavn gpvsvvvsgv
2801 eaavgqvvdq lvergrrvrr lavshafhsp lmdpmldafr avaegleyhq
2851 pripvvsnvt gevaaaeelc aadywvrhvr atvrfadgvr tlaergataf
2901 leigpdgvls alargvlpae alvtptlrkd rdeesallag larlhvagvt
2951 vdwsaaltgt gargtdlpty afqrerywpe laaepaggga daadaefwaa
3001 veradatala ahldidgdql gavlpalsaw rtrrrttsat nalrhreswe
3051 plslagtpht ggvlvlvpaa attdpwvadv vaalgpdarr vdvpadgtdr
3101 aalaalltea addtaptavv sllaldetsg ddavpagtta taalvqalad
3151 tgapaplwal trgavaalpd eqptapaqaa vwglgriaal elprhwgglv
3201 dlpadldert arrlpaalad agdedqlalr atgaygrrit papapddapg
3251 tgwqptgtvl itggtgalgr htarwlaahg aehllllsrs gpdapgaael
3301 tteltalgar vtlvacdaad reqltrvlae vprdcpltgv vhtagvlddg
3351 vltgltpdrf atvfrakvas avlldeltrd tdlavfalfs svagavgnpg
3401 qagyaaanav ldalaarrra qglagtsiaw gawagdgmaa rhtrpgaepv
3451 glldpdlavp alaravtepq ptlvladlqq prllesllal rpspllsrlp
3501 aartaaravq eadrrragaa adlrdqlagt apadrhavll rlvrttaaav
3551 lghtgadair adkpfrdlgf dsltavelss alaaatglal ppslvfdhps
3601 praladhlra eltgdrpesa paappapvpa adddpivvvg macrfpggvt
3651 tpeefwqlla egrdgidafp tdrgwdldvl grrrpgpqrp prsaassyda
3701 aafdpgffdi sprealamdp qqrllletaw eavertgtdp trlrgsrtgv
3751 fvgtngqdya glvlraqedv eghagtglaa svisgrlaya fgfegpavtv
3801 dtacssslva lhwavqalra gecslalagg vtvmttstsf agftrqggla
3851 pdghckafsd sadgtgwseg vgvlvverrs dalrngheil avvrgsavnq
3901 dgasngltap ngpaqqrvir qalanaglap gdvdaveahg tgtvlgdpie
3951 aqallatygq drpadrplwl gsvksnight qaaagaaglm kmvlalqhgt
4001 lprtlhvtep strvdwsaga vrlltertvw prtdrprrag vssfgisgtn
4051 ahvileqppa eptptapadr ptrtpavlpw vvsarsatal daqlarlraf
4101 aaerpdlppa dvahslvtsr atfehravll aapdgitaaa raearersta
4151 flfsgqgaqr sgmgrelhaa fpvfaaafde vvavldaela tgsgggvslr
4201 evmwgggsel ldrtrftqpa lfavevalfr lvaswgvgpe fvaghsvgei
4251 aaayvagvfs lvdacrlvva raslmdalpv ggvmvaveaa eaevvpllvd
4301 gvaiaavngp vsvvvsgvea avgqvvdqlv ergrrvrrla vshafhsplm
4351 dpmldafrav aegleyhqpr ipvvsnvtge vaaaeelcaa dywvrhvrat
4401 vrfadgvrtl aergatafle igpdgvlsal aaaclfdtda evvpalrkgr
4451 peehtaltaa aqlhvagvdi dwtavlagtg grrialptya fqrerywpsl
4501 aaqapgdagg lgleagrhpl lgaattvags aeilltgrls ttaqpwlavy
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4551 


eadgrtvlpa 


avlaelavra 


4601 


rvaapddtgr 


ralsvharpd 


4651 


peravpldal 


ptatgparia 


4701 


lldtavragg 


lldgdatlda 


4751 


eatdpqgapv vsvtgltlgt 


4801 


tggdhlpyav 


lgdqlaeldg 


4851 


apvlgvptge 


gdlpaavrgt 


4901 


aagaedvhdl 


aaapvwglvr 


4951 


tlaalldage 


tqaavradtl 


5001 


itggtgglgg 


llarhlvtgh 


5051 


evtvaacdva 


draaldrlla 


5101 


ldtvlrakad 


aawhlhdatr 


5151 


fldalaahra aqglpglsla 


5201 


tpeqglalf d 


aagargdgfa 


5251 


taaaaghagv 


larrlaalda 


5301 


rdfnrlgfds 


lmavelrtrl 


5351 


pggtaagpdr 


splaeldria 


5401 


rqdgggttvd 


drieaasaee 


SEQ 


ID NO. 3 9 





gdqadcptva eltvaaplvl tgaaaqrlqv 
dspdspwtlh atavlthdtp qppapdtgwp 
aawqwgdelc aeielpepgp aerafalhpa 
lgwrglalha asatalrvrl tpdgtdtwal 
ptvdrsgaga addgatlldl ewvpapqaap 
qlriagdgpg rvaslaalld ggaplprlvl 
ttavlellqr wtadartads hlvivtrgav 
saqsehpgsf llldldpadp agasraaapa 
tvarltraad gpeataghpv rdwdrdgtvl 
gikhlllagr rgpdapgara lrdelaalga 
qlppehplta vvhtagvldd atvgtltper 
drdlagfvly ssvagvtggp gqgnyaagnt 
wgpwgqdagm tgtlgaadla rlersgmppl 
vavrlargaa apgadevpav lralvrgrrr 
eqrhqalldl vrtetaavlg hsgadavpae 
atatgarlpa tlvfdhptpd avarhlastl 
aelspegadd atrqgvvgrl rhllaqwdgt 
vlafidhelg rqads 



NysK 



1 mpdekklvdy lkwvtkdlhq trqrlqevea grhepvaivg macrfpggvr 

51 spedlwells agrdgigpfp adrgwdlaal agdgpgrsat qeggflpdaa

101 afdpgffdis prealamdpq qrllletawe aversgidpa glrgsrtgvf 

151 vgtngqdyah lvlaaqddmg gyagnglaas vlsgrlafal glegpavtld 

201 tacssslvtl hlaaqavrag ecglalaggv tvmttsssfa gfslqgglap 

251 dgrckafaea adgtgwsegi glllverlsd aqrnghpvla vlrgsavnqd 

301 gasnglsapn gpsqqrvirq alagaglvpg dvdaveahgt gtrlgdpiea

351 gallatygqd rpadrplwlg svksnlghtq aaagvagvik mvlalrhgvl 

401 pqtlhvdaps shvdwesgav rlltapvaws egddrvrrag vssfgisgtn 

451 ahvileqapd qpeptaeeta aaapggtaee raaapvapra vpwpvaarta

501 galdaqlvrv ralttapgrt aadvghalat artpfehral lvheggavte 

551 vargavptgd rgglavlfsg qgsqrpgmgr elharypvfa aafdetvall

601 darlgtslrd ivwdqdrtxl ddtrhtqpal favevalyrl laswgirpdh 

651 vtghsigeit aahvagvltl adactlvaar atamselppg gamvaleate

701 devrplltdd laiaavnapr svvvagaeda alavrrhfdd lgrrttrlpv

751 shafhsplmd pmldafrtal apltfaepei pvvsnltglp ataeelatph 

801 ywvchvrqav rfgdgvrala drgvrtflel gpdgvlsalv renlpepglv 

851 avpvlrkerp eettvlaalg tlwahgadvd wdavfagtrt pqadpvelpt 

901 yafqrarywp tlgarhgdpa dlgqtaaahp llgaavtlad adetvltgrl 

951 alpshpwlgd hrsdgritvp gvafaelavr agdlsgtphl arldlpaplt 

1001 lgdgdtvtlq vrvgapdpag hrpltvharl aatedapwtt catgllapda 

1051 peapadpigp adagwpprda rpvpvadlda aataagrhyg phfqgltglw 

1101 rrdgevfaev alptataadr afgihpalla talrataald ddhtaghtpe 

1151 ptgitglalh atgatalrvr ltatgpdtva laaadatgga vltadtvtlg 

1201 spqdrpapap aghtgqgglf hldwvpvdpg sratgtrwav vgddeldlgy 

1251 alhradetvs ayaaslggai gdsglapdvf lvpvvggpda gpdavhavta 

1301 ralgllqewl neprlagarl vfvtrgavav pgetvtdpag aavwgllrsa 

1351 qtenpgslll vdlddafrsa gmlphvltld eqqlvvrdha vraarlarlp 

1401 epaagtapar awdpdgtvli tggtgglgaa larhlvtvrg arhlllagrr 

1451 gpeapgagel vaeltaqgad vrvaacdvgd rtaldallat vpaahpltav 

1501 vhtagvldda ligsltpdql atvlrpkada awhlhdatrg ldlagfvlys 

1551 svsgvlgspg qgnyaaanay ldalarhrad qglpalslaw gpwgrgsgmt 

1601 asvsdadler margglpplt vedglalfda avgrpepalv psrinvaglr 

1651 dqqalpalwr dlvprarrta atadrspvtv rerlrhldet gqeqllidlv 

1701 vgytagllgh pdptavdper gflelgfdsl vsvglrnqla eilglrlpss 

1751 ivfdskspvk larwlhqela ngpqpgatgp aaadarpavr sddtleglfy 

1801 navrggklve amrmlkavan trpmfdtpae leelsepvtl adgpgrprli 

1851 fvsapgatgg vhqyariaah frgsrhvsal plmgfapgel lpatseaaar 

1901 ivaesvlmas egepfvmvgh stggslayla agvledtwdv rpeavvlldt 

1951 asirynpgeg ndldrttrfy ladidspsvt lnsarmsama hwfmamtdiq 

2001 apaptaptll vraaraldgf rldtssvpad evrdidadhl slakehsalt 

2051 aqaiegwlae lpdpaa 

SEQ ID NO. 40

NysL 

1 mstptappsl kaevppvlrl spllrelqsr apvckvrtpa gdegwlvtrh 

51 telkqllhdd rlarahadpa napryvhnpf ldllvvddfd lartlhaemr 

101 slftpqfsar rvmdltprve alaegvlahf vaqgppadlh ndfslpfsls 

151 vlcaligvpa eeqgkliaal tklgelddpa rvqegqdelf gllsglarrk
201 ritpeddvis rlclkvpsde rigpiasgll fagldsvash idlgtvlfiq
251 hpdqlaaala deklmrgave eilrsakagg svlpryatad vpigdvtira 
301 gdlvlldftl vnfdrtvfde pelfdirrap nphltfghgm whcigaplar 
351 vnlrtaytll ftrlpglrlv rpveelrvls gqlsagltel pvtw 


SEQ ID NO. 41
NysM 

1 vritvdpgrc vgagqcvlta pdlfdqdddg lvtvlagaad aadpgdvrda

51 aalcpsgais vaad 

SEQ ID NO. 42

NysN

1 msteadarta apqcpvafpl rrpgrpfppp eyatyrggag lvrselpsgp

51 vwlvtrhedv ravltdpris adpskpgfpk agrtggapsq yevpgwfvam 

101 dppehgrfrk tlipeftvrk vrelrpviqq ivderidaml aagtsadlve 

151 sfalpvpslv issllgvpkv drdffedrtr vlvrlsstde erdkatqall 

201 rylgrliqik qrrpgddlis rliaagtlsr qelsgvamll liaghettan 

251 niglgvvqll tnprwigddr iveellryys vadlvafrva vedveiggql 

301 iragegivpl iaaanhdata faapsefdpe rsarshvafg ygvhqclgqn